ABSTRACT

Aims. Cosmic shear, the gravitational lensing on cosmological scales, is regarded as one of the most powerful probes for revealing 
the properties of dark matter and dark energy. To fully utilize its potential, one has to be able to control systematic effects down to 
below the level of the statistical parameter errors. Particularly worrisome in this respect is the intrinsic alignment of galaxies, causing 
considerable parameter biases via correlations between the intrinsic ellipticities of galaxies and the gravitational shear, which mimic 
lensing. Since our understanding of the underlying processes of intrinsic alignment is still poor, purely geometrical methods are 
required to control this systematic. In an earlier work we proposed a nulling technique that downweights this systematic, only making 
use of its well-known redshift dependence. We assess the practicability of nulling, given realistic conditions on photometric redshift 
information. 

Methods. For several simplified intrinsic alignment models and a wide range of photometric redshift characteristics, we calculate an 
average bias before and after nulling. Modifications of the technique are introduced to optimize the bias removal and minimize the 
information loss by nulling. We demonstrate that one of the presented versions of nulling is close to optimal in terms of bias removal, 
given the high quality of photometric redshifts. Although the nulling weights depend on cosmology, being composed of comoving 
distances, we show that the technique is robust against an incorrect choice of cosmological parameters when calculating the weights. 
Moreover, general aspects such as the behavior of the Fisher matrix under parameter-dependent transformations and the range of 
validity of the bias formalism are discussed in an appendix. 

Results. Given excellent photometric redshift information, i.e. at least 10 bins with a dispersion $\sigma_{\rm ph} < 0.03$, a negligible fraction of
catastrophic outliers, and precise knowledge about the bin-wise redshift distributions as characterized by a scatter of 0.001 or less on
the median redshifts, one version of nulling is capable of reducing the shear-intrinsic ellipticity contamination by at least a factor of
100. Alternatively, we describe a robust nulling variant which suppresses the systematic signal by about a factor of 10 for a very broad range
of photometric redshift configurations, provided basic information about $\sigma_{\rm ph}$ in each of $> 10$ photometric redshift bins is available.
Irrespective of the photometric redshift quality, a loss of statistical power is inherent to nulling, which amounts to a decrease of the
order of 50% in terms of our figure of merit under conservative assumptions.

Key words. cosmology: theory - gravitational lensing - large-scale structure of the Universe - cosmological parameters - methods:
data analysis
orientation of the galaxy image with respect to a reference axis.
In the approximation of weak lensing $\epsilon$ can be written as the sum
of the intrinsic ellipticity $\epsilon^{\rm s}$ of the galaxy and the gravitational
shear $\gamma$. Applying this relation, the correlator of ellipticities for
two galaxy populations $i$ and $j$ reads



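$$
\langle \epsilon_i \epsilon_j^* \rangle =
\underbrace{\langle \gamma_i \gamma_j^* \rangle}_{\rm GG}
+ \underbrace{\langle \epsilon^{\rm s}_i \epsilon^{{\rm s}*}_j \rangle}_{\rm II}
+ \underbrace{\langle \gamma_i \epsilon^{{\rm s}*}_j \rangle + \langle \epsilon^{\rm s}_i \gamma_j^* \rangle}_{\rm GI} .
\tag{1}
$$
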
If one assumes that the intrinsic ellipticities of galaxies are
randomly oriented in the sky, only the desired lensing (GG) term
remains on the right-hand side. However, when galaxies are
subject to the tidal forces of the same matter structure, their shapes
can intrinsically align and become correlated, thus causing a
non-vanishing II term. Moreover, a matter overdensity can align
a close-by galaxy and at the same time contribute to the lensing
signal of a background object, which results in non-zero
correlations between gravitational shear and intrinsic ellipticities or a
GI term (Hirata & Seljak 2004, HS04 hereafter).

The alignment of dark matter haloes, resulting from external
tidal forces, has been subject to extensive study, both analytic
and numerical (Croft & Metzler 2000; Heavens et al. 2000;
Lee & Pen 2000; Catelan et al. 2001; Crittenden et al. 2001;
Jing 2002; Mackey et al. 2002; HS04; Bridle & Abdalla 2007;
Schneider & Bridle 2009). The galaxies in turn are assumed to
align with the angular momentum vector (in the case of spiral
galaxies) or the shape of their host halo (in the case of elliptical
galaxies), which is suggested by the observed correlations
of galaxy spins (e.g. Pen et al. 2000) and galaxy ellipticities
(e.g. Brainerd et al. 2009). However, this alignment is not perfect,
see for instance van den Bosch et al. (2002), Okumura et al.
(2009), and Okumura & Jing (2009). The intrinsic correlations
of galaxy properties cause non-zero II and GI signals, as
observationally verified in several surveys by e.g. Brown et al. (2002),
Heymans et al. (2004), Mandelbaum et al. (2006), Hirata et al.
(2007), and Brainerd et al. (2009).

Observations as well as predictions from theory are consis- 
tent with a contamination of the order of 10 % by both II and 
GI signal for future cosmic shear surveys, which makes the con- 
trol of these systematics crucial. However, analytic progress to 
calculate intrinsic alignment correlations beyond linear theory 
is cumbersome, and the inclusion of gas physics to fully simulate
the formation and evolution of galaxies in their dark matter
haloes is computationally still too expensive (see e.g. Schaefer
2008 for a review on the work about galaxy spin correlations), so
that for the time being our understanding of intrinsic alignment 
remains at the level of toy models. 

Hence, removal techniques should rely on intrinsic align- 
ment models as little as possible. The II signal is relatively 
straightforward to eliminate because it is restricted to pairs of 
galaxies that are physically close to each other, both galaxies
being affected by the same matter structure (King & Schneider
2002, 2003; Heymans & Heavens 2003; Takada & White 2004).
For an application of the II removal to the COMBO-17 survey
see Heymans et al. (2004).

First ideas on how to control the GI signal were already put forward
by HS04. King (2005) uses a set of template functions to fit
the lensing and intrinsic alignment signals simultaneously, making
use of their different dependence on angular scales and redshift.
Similarly, Bridle & King (2007) investigate the effect of
the GI term on parameter constraints by binning the systematic
signal in angular frequency and redshift with free parameters,
which are then marginalized over. In both approaches an intrinsic
alignment toy model is used as fiducial model. Increasing
freedom in the representation of the GI signal is achieved at the
cost of a bigger number of nuisance parameters, which dilutes
the cosmological information that can be extracted from the data.

In addition to ellipticity correlations one can also measure 
galaxy densities in cosmic shear surveys, so that ellipticity- 
density and density-density correlations can be added to the 
data analysis. This information is then used to self-calibrate systematic
effects of weak lensing (e.g. Hu & Jain 2004; Bernstein
2008). Zhang (2008) applies the self-calibration technique to the
GI contamination, deriving an approximate relation between GI
and the galaxy density-intrinsic ellipticity correlations.

In a purely geometric approach Joachimi & Schneider
(2008), JS08 hereafter, have presented a technique to null the
GI signal, based exclusively on weak lensing data. Making use 
of the characteristic dependence on redshift, new cosmic shear 
measures are constructed that are completely free of any possi- 
ble GI systematic, given perfect redshift information. In a case 
study it was shown in JS08 that for more than about 10 red- 
shift bins up to z = 4, still without photometric redshift errors, 
the nulling technique only moderately widens parameter con- 
straints. To demonstrate its practicability, it is vital to assess the 
performance of nulling in the presence of photometric redshift inaccuracies
and to quantify the actual suppression of the GI signal
since the removal is not necessarily perfect as idealized assump- 
tions in the derivation of the method have been made. It is the 
scope of this work to investigate the modification of statistical 
and systematic errors by the nulling technique in a more real- 
istic setup, including photometric redshift errors. Furthermore, 
we are going to provide minimum requirements on the quality 
of redshift information to be able to practically apply nulling. 

The paper is structured as follows: In Sect. 2 we review the
nulling technique, slightly modifying the approach to further
simplify notation and usage. Moreover, we give an overview of
the Fisher matrix and bias formalism in the context of the data
transformation that corresponds to nulling. Section 3 summarizes
our model specifications concerning photometric redshift
errors, lensing data, and intrinsic alignment signals. In Sect. 4 we
determine the nulling parameters such that the corresponding
transformation removes a maximum of systematic signal.
Besides, we address the dependence of the nulling weights on
cosmology. In Sect. 5 the performance of nulling in terms of
photometric redshift binning is elaborated on, leading to
considerations of the minimum information loss of this technique. In
addition, we develop a weighting scheme to control intrinsic
alignment contamination that is not eliminated by nulling itself. Section 6
deals with the effect of photometric redshift uncertainty and
assesses to what extent the chosen nulling versions are optimal.
The influence of catastrophic outliers and of uncertainty in the
parameters of the redshift distributions is quantified in Sect. 7. In
Sect. 8 we summarize our findings and conclude. The appendices
provide a discussion of parameter-dependent transformations of
the Fisher matrix and a formal derivation of the bias formalism,
including an assessment of its validity.



2. Method 

2.1. Nulling technique 

We briefly review the principles of the nulling technique as pre- 
sented in JS08 and develop a compact formalism. As before, 
we restrict our considerations to Fourier space by using power 
spectra as the cosmic shear measures, but it is straightforward 
to implement the formalism in terms of any of the second-order 
real-space measures. Throughout the paper a spatially flat universe
is assumed. For recent reviews on weak lensing see e.g.
Munshi et al. (2008) for theoretical issues and Hoekstra & Jain
(2008) who focus on observational aspects; Heavens (2008) provides
a concise overview. We largely follow the notation of
Schneider (2006).

Consider a cosmic shear survey that is divided into $N_z$ redshift
slices by means of photometric redshift information, yielding
a data set of tomography convergence power spectra $P_{\rm GG}^{(ij)}(\ell)$,
where the indices $i$ and $j$ run from 1 to $N_z$, and where the angular
frequency $\ell$ denotes the Fourier variable on the sky. We use
the convention that in the superscript of the power spectra the
first bin refers to the redshift distribution with lower median
redshift, i.e. $i < j$. The convergence power spectra are radial projections
of the three-dimensional power spectrum of matter density
fluctuations $P_{\delta\delta}$ as given by Limber's equation in Fourier space
(Kaiser 1992),



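$$
P_{\rm GG}^{(ij)}(\ell) = \frac{9 H_0^4 \Omega_{\rm m}^2}{4 c^4}
\int_0^{\chi_{\rm hor}} {\rm d}\chi\; g^{(i)}(\chi)\, g^{(j)}(\chi)\, \{1+z(\chi)\}^2\,
P_{\delta\delta}\!\left( \frac{\ell}{\chi}, \chi \right) .
\tag{2}
$$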



Here and in the following, the dependence of the power spectra
on time is encoded in the second argument. The
redshift is denoted by $z$, while $\chi$ is the comoving distance, with
its maximum at the comoving horizon distance $\chi_{\rm hor}$. These two
quantities are related via the distance-redshift relation



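$$
\chi(z) = \frac{c}{H_0} \int_0^z {\rm d}z'
\left\{ \Omega_{\rm m} (1+z')^3 + \Omega_{\rm de}(z') \right\}^{-1/2} ,
\tag{3}
$$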



where $\Omega_{\rm de}(z) = \Omega_{\rm de,0}$ in case of a cosmological constant. The
parametrization of $\Omega_{\rm de}(z)$ in a universe with variable dark energy
is given in Sect. 3.2. The weighting in the projection (2),
specific to weak gravitational lensing, is the lensing efficiency



$$
g^{(i)}(\chi) = \int_\chi^{\chi_{\rm hor}} {\rm d}\chi'\; p^{(i)}(\chi') \left( 1 - \frac{\chi}{\chi'} \right) ,
\tag{4}
$$



where $p^{(i)}(\chi)$ is the normalized probability distribution of
comoving distances of a galaxy population $i$. Hence, the lensing
efficiency corresponds to the ratio $D_{\rm ds}/D_{\rm s}$ of the angular diameter
distance between lens and source and the one between observer
and source, averaged over the source distances of the galaxy
population $i$.
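
As a rough numerical illustration of (3) and (4), the following Python sketch evaluates the comoving distance for a flat universe with a cosmological constant and the lensing efficiency of a narrow source distribution. The value $\Omega_{\rm m} = 0.3$, the use of distances in units of $c/H_0$, the top-hat source distribution, and all function names are assumptions made for this example only; they are not the survey model specified in Sect. 3.

```python
import math

OMEGA_M = 0.3  # assumed matter density; flat universe, so Omega_de = 1 - OMEGA_M


def comoving_distance(z, n_steps=2000):
    """chi(z) in units of c/H0 from Eq. (3), integrated with the trapezoidal rule."""
    if z == 0.0:
        return 0.0
    dz = z / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        zk = k * dz
        f = 1.0 / math.sqrt(OMEGA_M * (1.0 + zk) ** 3 + (1.0 - OMEGA_M))
        total += f if 0 < k < n_steps else 0.5 * f
    return total * dz


def lensing_efficiency(chi, chi_lo, chi_hi, n_steps=2000):
    """g(chi) from Eq. (4) for sources uniform in [chi_lo, chi_hi], chi_lo > 0."""
    p = 1.0 / (chi_hi - chi_lo)      # normalized top-hat p(chi')
    start = max(chi, chi_lo)         # integrand vanishes below chi and below chi_lo
    if start >= chi_hi:
        return 0.0
    d = (chi_hi - start) / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        chi_prime = start + k * d
        f = p * (1.0 - chi / chi_prime)
        total += f if 0 < k < n_steps else 0.5 * f
    return total * d


chi_lens = comoving_distance(0.3)
chi_src = comoving_distance(1.0)
# A very narrow source bin should reproduce the ratio 1 - chi_lens / chi_src.
g_narrow = lensing_efficiency(chi_lens, 0.999 * chi_src, 1.001 * chi_src)
g_delta = 1.0 - chi_lens / chi_src
print(chi_lens, chi_src, g_narrow, g_delta)
```

For a sufficiently narrow source bin the two printed efficiencies agree closely, which is the limit exploited by the nulling technique below.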

Intrinsic alignment leads to correlations between the intrinsic
ellipticities of galaxies and between intrinsic ellipticity and
gravitational shear, thereby adding a systematic signal to the lensing
observables (1). In analogy to (2), the II and GI power spectra
can be written as (HS04)



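$$
P_{\rm II}^{(ij)}(\ell) = \int_0^{\chi_{\rm hor}} {\rm d}\chi\; p^{(i)}(\chi)\, p^{(j)}(\chi)\, \chi^{-2}\,
P_{\gamma^{\rm I}\gamma^{\rm I}}\!\left( \frac{\ell}{\chi}, \chi \right) ,
\tag{5}
$$

$$
P_{\rm GI}^{(ij)}(\ell) = \frac{3 H_0^2 \Omega_{\rm m}}{2 c^2}
\int_0^{\chi_{\rm hor}} {\rm d}\chi\; \left\{ p^{(i)}(\chi)\, g^{(j)}(\chi) + g^{(i)}(\chi)\, p^{(j)}(\chi) \right\}
\{1+z(\chi)\}\, \chi^{-1}\, P_{\delta\gamma^{\rm I}}\!\left( \frac{\ell}{\chi}, \chi \right) .
\tag{6}
$$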



In order to define the three-dimensional power spectra employed
here, we write $\epsilon^{\rm s} = \gamma^{\rm I} + \epsilon^{\rm rnd}$, i.e. the intrinsic ellipticity is split up
into the contributions by an intrinsic shear field $\gamma^{\rm I}(\boldsymbol{x})$ that
contains the intrinsic alignment effects, being continuous as a
function of position vector $\boldsymbol{x}$, and a purely random component $\epsilon^{\rm rnd}$.
The latter term is correlated neither with gravitational or intrinsic
shear, nor with $\epsilon^{\rm rnd}$ of other galaxies. Analogously to the
lensing case one can introduce an intrinsic convergence $\kappa^{\rm I}$ such that
$\tilde{\kappa}^{\rm I}(\boldsymbol{k}) = \tilde{\gamma}^{\rm I}(\boldsymbol{k})\, {\rm e}^{-2{\rm i}\varphi_k}$, where the tilde denotes the Fourier
transform, and where $\varphi_k$ is the azimuthal angle of the wave vector
$\boldsymbol{k}$.

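Then one defines the intrinsic shear E-mode power spectrum
$P_{\gamma^{\rm I}\gamma^{\rm I}}$ and the matter-intrinsic shear cross-power spectrum $P_{\delta\gamma^{\rm I}}$ as

$$
\langle \tilde{\kappa}^{\rm I}_{\rm E}(\boldsymbol{k},\chi)\, \tilde{\kappa}^{{\rm I}*}_{\rm E}(\boldsymbol{k}',\chi) \rangle
= (2\pi)^3\, \delta_{\rm D}^{(3)}(\boldsymbol{k}-\boldsymbol{k}')\, P_{\gamma^{\rm I}\gamma^{\rm I}}(k,\chi) ,
\tag{7}
$$

$$
\langle \tilde{\delta}(\boldsymbol{k},\chi)\, \tilde{\kappa}^{{\rm I}*}_{\rm E}(\boldsymbol{k}',\chi) \rangle
= (2\pi)^3\, \delta_{\rm D}^{(3)}(\boldsymbol{k}-\boldsymbol{k}')\, P_{\delta\gamma^{\rm I}}(k,\chi) ,
\tag{8}
$$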

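where $\delta_{\rm D}^{(3)}$ is the Dirac delta-distribution. In analogy to (7) a
B-mode intrinsic shear power spectrum can be defined as well
(Schneider & Bridle 2009). The cross-power spectra between
intrinsic shear E- and B-mode $\langle \tilde{\kappa}^{\rm I}_{\rm E}(\boldsymbol{k},\chi)\, \tilde{\kappa}^{{\rm I}*}_{\rm B}(\boldsymbol{k}',\chi) \rangle$ and between
matter and intrinsic shear B-mode $\langle \tilde{\delta}(\boldsymbol{k},\chi)\, \tilde{\kappa}^{{\rm I}*}_{\rm B}(\boldsymbol{k}',\chi) \rangle$ should
vanish if one demands parity invariance of the intrinsic shear
field (see Schneider 2003).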

To see the equivalence between the definition in (8) and the
one in HS04, consider the Fourier transform of the correlator
$\langle \delta(0,\chi)\, \gamma^{\rm I}_+(\boldsymbol{x},\chi) \rangle$, which is given by

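$$
\dots \times \cos(2\varphi_k)\, \langle \tilde{\delta}(\boldsymbol{k}',\chi)\, \tilde{\kappa}^{\rm I}_{\rm E}(\boldsymbol{k},\chi) \rangle ,
\tag{9}
$$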

where it was assumed that the +-component of the intrinsic shear
is measured along $x_\perp$, the transverse separation component of
the position vector $\boldsymbol{x}$. Inserting (8) and integrating along the line
of sight, one obtains

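$$
\int {\rm d}x_\parallel\; \langle \delta(0,\chi)\, \gamma^{\rm I}_+(\boldsymbol{x},\chi) \rangle
= - \int \frac{{\rm d}k_\perp\, k_\perp}{2\pi}\; J_2(k_\perp x_\perp)\, P_{\delta\gamma^{\rm I}}(k_\perp,\chi) ,
\tag{10}
$$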

where the definition of the second-order Bessel function of the
first kind, written as $J_2$, was employed in addition. By making
use of the orthogonality relations of Bessel functions, one arrives
at the defining equation of $P_{\delta\gamma^{\rm I}}$ in HS04, Eq. 12.

Note that HS04 account for source clustering by using the
weighted intrinsic shear $\gamma^{\rm I}(1 + \delta_{\rm g})$, where $\delta_{\rm g}$ is the density contrast
of galaxies. Since in this work we merely implement the
linear alignment GI signal, which does not have any contribution
due to source clustering, we drop the tilde that marks the
weighted intrinsic shear in the notation of HS04 to avoid confusion
with Fourier transforms.

The explicit forms of both $P_{\gamma^{\rm I}\gamma^{\rm I}}$ and $P_{\delta\gamma^{\rm I}}$ depend on the intricacies
of galaxy formation and evolution within their dark matter
environment, and are to date only poorly constrained from both
theory and observations (for a recent theoretical approach based
on the halo model see Schneider & Bridle 2009). Thus, it is currently
impossible to model these systematics with the necessary
accuracy to precisely measure cosmological parameters by cosmic
shear without risking a severe bias.

Consequently, one has to rely on geometrical methods to remove
the intrinsic alignment systematics. The II signal stems
from pairs of galaxies that are physically close, i.e. close both
on the sky and in (spectroscopic) redshift. As long as the redshift
distributions of galaxies are relatively concentrated, one can
thus eliminate the II correlations by removing pairs of galaxies
close in photometric redshift estimates (King & Schneider 2002;
Heymans & Heavens 2003), as is also evident from the weighting
in the integrand of (5). Takada & White (2004) have shown
that excluding the auto-correlations from the analysis increases
statistical errors only moderately by about 10% when using at
least five redshift slices. We follow this approach by excluding
auto-correlations from our investigations. A more sophisticated
downweighting scheme of the II signal in the presence of tomography
cosmic shear data can be readily incorporated into the
nulling technique. Hence, we are going to neglect the contamination
by the II signal in what follows. However, as we will
also deal with cases of large photometric errors, an II signal is
expected to be present in cross-correlations of different redshift
distributions. This limits the validity of dropping the II signal, as
will be assessed in Sect. 3.3.

To eliminate the GI contamination, we null all contributions 
to the lensing signal from matter, located at the redshift of the 
galaxies in distribution i, i.e. the distribution with lower median 
redshift. The derivation of the nulling technique is based on the 
assumption of narrow photometric redshift bins, so that we write 



$$
p^{(i)}(\chi) \approx \delta_{\rm D}(\chi - \chi(z_i)) ,
\tag{11}
$$



where $\chi(z_i)$ is the comoving distance corresponding to an
appropriately chosen redshift $z_i$ within distribution $i$. As a
consequence, the lensing efficiency (4) simplifies to $g^{(i)}(\chi) \approx 1 - \chi/\chi(z_i)$
for $\chi \leq \chi(z_i)$ and 0 else. Introducing a weight function
$B^{(i)}(\chi)$, one can define a modified lensing efficiency via



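$$
\hat{g}^{(i)}(\chi) = \int_\chi^{\chi_{\rm hor}} {\rm d}\chi'\; B^{(i)}(\chi') \left( 1 - \frac{\chi}{\chi'} \right) ,
\tag{12}
$$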



which constitutes a weighted integral over the approximated
lensing efficiency. The lower integration limit was changed from
0 to $\chi$ because the lensing efficiency in the integrand vanishes
for $\chi' < \chi$, see above. The weight function is constrained by the
equation



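$$
\hat{g}^{(i)}(\chi(z_i)) = \int_{\chi(z_i)}^{\chi_{\rm hor}} {\rm d}\chi'\; B^{(i)}(\chi') \left( 1 - \frac{\chi(z_i)}{\chi'} \right) = 0 ,
\tag{13}
$$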



meaning that if the background lensing efficiency $g^{(j)}(\chi)$ in (2) is
replaced by (12), the contribution of matter at $\chi(z_i)$ to the lensing
signal of the background population $j$ is nulled, as desired.

Equation (13) only ensures that the contribution to the lensing
signal is eliminated exactly at $\chi(z_i)$, but since the lensing
efficiency is a smooth function of $\chi$, the contributions from
neighboring distances will also be largely downweighted. Therefore,
one does not expect a perfect removal, but a substantial
suppression of the GI signal due to nulling, provided that the
distance probability distribution is sufficiently compact. In the still
unconstrained range $0 \leq \chi \leq \chi(z_i)$, $B^{(i)}(\chi)$ is set to zero.
Henceforth, we denote the distribution in which the signal is
nulled, or equivalently, the photometric redshift bin this
distribution corresponds to, by 'initial bin'.

Assuming disjoint, narrow bins in redshift also for (2) by
inserting (11), one can define a tomography power spectrum,
evaluated at precisely known comoving distances,

$$
P_{\rm GG}(\ell; \chi_i, \chi_j) = \frac{9 H_0^4 \Omega_{\rm m}^2}{4 c^4}
\int_0^{\min(\chi_i,\chi_j)} {\rm d}\chi \left( 1 - \frac{\chi}{\chi_i} \right) \left( 1 - \frac{\chi}{\chi_j} \right)
\{1+z(\chi)\}^2\, P_{\delta\delta}\!\left( \frac{\ell}{\chi}, \chi \right) .
\tag{14}
$$

According to the modification of the lensing efficiency (12),
JS08 have introduced new power spectra of the form

$$
\Pi^{(i)}(\ell) = \int_0^{\chi_{\rm hor}} {\rm d}\chi'\; B^{(i)}(\chi')\, P_{\rm GG}(\ell; \chi(z_i), \chi')
\approx \sum_{j=i+1}^{N_z} B^{(i)}(\chi(z_j))\, P_{\rm GG}^{(ij)}(\ell)\, \chi'(z_j)\, \Delta z_j ,
\tag{15}
$$

where $\Delta z_j$ denotes the width of photometric redshift bin $j$, and
where $\chi'(z)$ is the derivative of comoving distance with respect
to redshift, which can be obtained analytically from (3). The second
term in (15) is the approximation of the foregoing integral by
a Riemann sum. It reflects the fact that information about the
radial distance is available only in discrete, binned form, and in
terms of redshift rather than comoving distance. Since the weight
function $B^{(i)}(\chi)$ vanishes for $\chi \leq \chi(z_i)$, the sum starts only at
bin $i+1$. We will use the discrete expression of (15) throughout
this work, including cases in which the photometric redshift bins
are broad and overlapping. Transforming the constraint equation
(13) to an integral over redshift, and discretizing analogously to
(15), one arrives at

$$
\sum_{j=i+1}^{N_z} B^{(i)}(\chi(z_j))\, \chi'(z_j)\, \Delta z_j \left( 1 - \frac{\chi(z_i)}{\chi(z_j)} \right) = 0 .
\tag{16}
$$

With these equations at hand we are able to demonstrate
how this technique removes the GI signal. In practice, the power
spectra $\Pi^{(i)}(\ell)$ will not only be composed of the lensing power
spectra as written in (15), but of the observed signal
$P_{\rm GG}^{(ij)}(\ell) + P_{\rm GI}^{(ij)}(\ell)$, where the latter term is unknown. Using (11)
again, (6) is modified as follows,

$$
P_{\rm GI}^{(ij)}(\ell) \approx \frac{3 H_0^2 \Omega_{\rm m}}{2 c^2}\; g^{(j)}(\chi(z_i))\; \frac{1+z_i}{\chi(z_i)}\;
P_{\delta\gamma^{\rm I}}\!\left( \frac{\ell}{\chi(z_i)}, \chi(z_i) \right)
\approx \frac{3 H_0^2 \Omega_{\rm m}}{2 c^2} \left( 1 - \frac{\chi(z_i)}{\chi(z_j)} \right) \frac{1+z_i}{\chi(z_i)}\;
P_{\delta\gamma^{\rm I}}\!\left( \frac{\ell}{\chi(z_i)}, \chi(z_i) \right) ,
\tag{17}
$$

where the approximation has been applied to distribution $i$ in the
first step and to distribution $j$ in the second equality. The latter
transformation only affects the lensing efficiency and is readily
seen by inserting the approximated distance distribution into (4).
Note that the second term in (6), containing $g^{(i)}(\chi)\, p^{(j)}(\chi)$,
vanishes if the redshift distributions do not overlap. This does not
hold anymore for more realistic, broader distributions, the
consequences being discussed in Sect. 5.3. Now assume that $P_{\rm GI}^{(ij)}(\ell)$,
in the form as given in the second equality of (17), adds to the
lensing signal. Computing the nulled power spectrum $\Pi^{(i)}(\ell)$
according to the discrete form of (15), one readily finds that this
new power spectrum does not have a GI contamination anymore
if (16) is fulfilled.

For the sake of a compact notation we define the vectors

$$
\boldsymbol{T}^{(i)}_{[0]} = \left( T^{(i)}_{[0]\,i+1}, \dots, T^{(i)}_{[0]\,N_z} \right)^{\top}
\quad \mbox{with} \quad T^{(i)}_{[0]j} = 1 - \frac{\chi(z_i)}{\chi(z_j)} ,
$$

$$
\boldsymbol{T}^{(i)}_{[1]} = \left( T^{(i)}_{[1]\,i+1}, \dots, T^{(i)}_{[1]\,N_z} \right)^{\top}
\quad \mbox{with} \quad T^{(i)}_{[1]j} = B^{(i)}(\chi(z_j))\, \chi'(z_j)\, \Delta z_j ,
\tag{18}
$$

so that the constraint (16) turns into an orthogonality relation,
$\boldsymbol{T}^{(i)}_{[0]} \cdot \boldsymbol{T}^{(i)}_{[1]} = 0$. We now compute more weights $\boldsymbol{T}^{(i)}_{[q]}$ with $q \geq 2$
in order to construct further new power spectra of 'order' $q$,

$$
\Pi^{(i)}_{[q]}(\ell) = \sum_{j=i+1}^{N_z} T^{(i)}_{[q]j}\, P_{\rm GG}^{(ij)}(\ell) ,
\tag{19}
$$

where the weights are specified by the requirement

$$
\boldsymbol{T}^{(i)}_{[q]} \cdot \boldsymbol{T}^{(i)}_{[r]} = 0 \quad \mbox{for all} \quad 0 \leq r < q .
\tag{20}
$$

In the discretized version given by (16) the weight function has
$N_z - i$ free parameters, namely the function values $B^{(i)}(\chi(z_j))$. For
fixed initial bin $i$ these free parameters translate into the $(N_z - i)$-dimensional
vectors $\boldsymbol{T}^{(i)}_{[q]}$. Since (16) does not restrict the overall
amplitude, we fix the normalization by assigning unit length to
the vectors $\boldsymbol{T}^{(i)}_{[q]}$. In total, one can thus construct $N_z - i$ new power
spectra per bin $i$, but since the additional constraint (16) reduces
the degrees of freedom by one, one new power spectrum cannot
be freed from the GI contamination. It is the zeroth-order power
spectrum, also constructed via (19) for $q = 0$, which obviously
cannot fulfill the nulling constraint.

B. Joachimi and P. Schneider: The removal of shear-ellipticity correlations from the cosmic shear signal

By defining vectors that contain the cosmic shear observables,
i.e. in our case the power spectra,

$$
\boldsymbol{P}^{(i)}(\ell) = \left( P_{\rm GG}^{(i,i+1)}(\ell), \dots, P_{\rm GG}^{(i,N_z)}(\ell) \right)^{\top} , \qquad
\boldsymbol{\Pi}^{(i)}(\ell) = \left( \Pi^{(i)}_{[0]}(\ell), \dots, \Pi^{(i)}_{[N_z-i-1]}(\ell) \right)^{\top} ,
\tag{21}
$$

and composing the transformation matrix

$$
\mathbf{T}^{(i)} = \left( \boldsymbol{T}^{(i)}_{[0]}, \dots, \boldsymbol{T}^{(i)}_{[N_z-i-1]} \right)^{\top}
\tag{22}
$$

for every distribution $i$ and angular frequency $\ell$, the new power
spectra are given by $\boldsymbol{\Pi}^{(i)}(\ell) = \mathbf{T}^{(i)} \boldsymbol{P}^{(i)}(\ell)$. Due to the construction
of the weights the transformation matrix is orthogonal, and
so is the transformation of the full data set. Therefore the nulling
technique can be interpreted as a rotation of the cosmic shear
data vector such that in the rotated set the GI contamination is
restricted to certain elements, namely those with a subscript [0].
By removing these, one loses part of the lensing signal and hence
statistical power, but eliminates the GI systematic within the limits
of the approximations made in the foregoing derivation.
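
The construction of the weights and the effect of the rotation can be illustrated with a small Python sketch. It builds the zeroth-order weights from (18), completes them to an orthonormal set satisfying (20), and applies the resulting matrix to a toy data vector whose GI term has the redshift dependence of (17). The comoving distances, the toy power spectrum values, the GI amplitude, and the use of Gram-Schmidt orthogonalization on unit vectors are all assumptions of this example; the text only requires the weights to be mutually orthogonal and of unit length.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def nulling_matrix(chi_i, chi_background):
    """Rows T_[0], T_[1], ... for initial bin i; orthonormal by construction."""
    n = len(chi_background)
    # Zeroth-order weights follow the GI redshift dependence 1 - chi_i/chi_j.
    rows = [normalize([1.0 - chi_i / chi_j for chi_j in chi_background])]
    # Complete the basis by Gram-Schmidt on unit vectors (one possible choice).
    for k in range(n):
        if len(rows) == n:
            break
        v = [1.0 if m == k else 0.0 for m in range(n)]
        for _ in range(2):  # two projection passes keep round-off errors small
            for r in rows:
                c = dot(v, r)
                v = [x - c * y for x, y in zip(v, r)]
        if math.sqrt(dot(v, v)) > 1e-8:
            rows.append(normalize(v))
    return rows


# Assumed comoving distances (arbitrary units) of bin i and of the bins j > i.
chi_i = 0.30
chi_bg = [0.45, 0.60, 0.75, 0.90]
T = nulling_matrix(chi_i, chi_bg)

# Toy signal at one angular frequency: lensing part plus a GI term
# proportional to 1 - chi_i/chi_j, cf. the second equality of (17).
amplitude_gi = 0.1
p_gg = [0.8, 1.1, 1.5, 1.9]
p_gi = [-amplitude_gi * (1.0 - chi_i / c) for c in chi_bg]
p_obs = [a + b for a, b in zip(p_gg, p_gi)]

nulled_obs = [dot(row, p_obs) for row in T]
nulled_gg = [dot(row, p_gg) for row in T]
# GI contamination left in each order q; only q = 0 should be affected.
residual = [a - b for a, b in zip(nulled_obs, nulled_gg)]
print(residual)
```

In this idealized setting the residual contamination is confined to the first entry, i.e. the order $q = 0$, while all orders $q \geq 1$ are free of the GI term up to round-off.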

Since nulling is performed as a rotation, the dimension of the nulled
data vector, which is composed of the $\boldsymbol{\Pi}^{(i)}(\ell)$ for every $i$ and $\ell$, is exactly
the same as for the original data set. For the data analysis one
removes the contaminated nulled power spectra with subscript [0],
i.e. one entry per initial bin. This is the step that actually does the
nulling and modifies both statistical and systematic error budgets.
In this work, we are going to use all remaining nulled power
spectra with $q \geq 1$ throughout. Since they are merely specified
by being composed of mutually orthogonal weights, there is no
ordering among different $q$. In particular, it is impossible to make
a priori statements about the information content of different
orders $q$.

It should be noted, however, that one can combine the formalism
outlined above with a data compression algorithm, based
on Fisher information. As investigated in JS08, nearly all information
about cosmological parameters can be concentrated in a
limited set of nulled power spectra, constructed from the first-order
weights $\boldsymbol{T}^{(i)}_{[1]}$. The additional requirement that a suitable
combination of Fisher matrix elements is to be maximized introduces
a strong hierarchy in terms of information content into
the sequence of $\Pi^{(i)}_{[q]}(\ell)$ with $q \geq 1$. We will not consider such an
optimization in this work.

2.2. Fisher matrix formalism 

In the following analysis we will make use of the Fisher matrix
formalism (see Tegmark et al. 1997 for details) to obtain parameter
constraints. Probing the likelihood locally around its maximum,
it is computationally much cheaper than a full likelihood
analysis and thus useful for error estimates for a large set of models.
The elements of the Fisher matrix are defined by



$$
F_{\mu\nu} = - \left\langle \frac{\partial^2 \ln L}{\partial p_\mu\, \partial p_\nu} \right\rangle
\tag{23}
$$



for a set of parameters $\boldsymbol{p}$, where $L$ denotes the likelihood.
In this paper the set of cosmological parameters
$\{\Omega_{\rm m}, \sigma_8, h_{100}, n_{\rm s}, \Omega_{\rm b}, w_0, w_a\}$ is considered, see Sect. 3.2 for further
details.

To second-order Taylor expansion around the maximum 
likelihood point the likelihood can be described by a multivari- 
ate Gaussian, so that, as long as only regions in parameter space 
are probed where the non-Gaussian contributions are negligible, 
it is sufficient to consider a Gaussian likelihood 



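$$
L_x(\boldsymbol{x}|\boldsymbol{p}) = \frac{1}{(2\pi)^{N_{\rm d}/2} \sqrt{\det \mathbf{C}_x(\boldsymbol{p})}}
\exp\left\{ -\frac{1}{2} \left[ \boldsymbol{x} - \bar{\boldsymbol{x}}(\boldsymbol{p}) \right]^{\top} \mathbf{C}_x(\boldsymbol{p})^{-1} \left[ \boldsymbol{x} - \bar{\boldsymbol{x}}(\boldsymbol{p}) \right] \right\}
\tag{24}
$$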



for a data vector $\boldsymbol{x}$ with expectation value $\bar{\boldsymbol{x}}(\boldsymbol{p})$ and covariance
$\mathbf{C}_x(\boldsymbol{p})$, where $N_{\rm d}$ is the dimension of the full data vector.
Tegmark et al. (1997) have shown that for this case the Fisher
matrix reads



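$$
F_{\mu\nu} = \frac{1}{2} {\rm tr} \left[ \mathbf{C}_x^{-1} \frac{\partial \mathbf{C}_x}{\partial p_\mu} \mathbf{C}_x^{-1} \frac{\partial \mathbf{C}_x}{\partial p_\nu} \right]
+ \frac{1}{2} {\rm tr} \left[ \mathbf{C}_x^{-1} \left( \frac{\partial \bar{\boldsymbol{x}}}{\partial p_\mu} \frac{\partial \bar{\boldsymbol{x}}^{\top}}{\partial p_\nu}
+ \frac{\partial \bar{\boldsymbol{x}}}{\partial p_\nu} \frac{\partial \bar{\boldsymbol{x}}^{\top}}{\partial p_\mu} \right) \right] ,
\tag{25}
$$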


where the argument of $\bar{\boldsymbol{x}}$ and $\mathbf{C}_x$ has been omitted for convenience.

Now consider an invertible linear transformation T of the 
data vector, 



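$$
\boldsymbol{y} = \mathbf{T} \boldsymbol{x} \; ; \qquad \mathbf{C}_y = \mathbf{T}\, \mathbf{C}_x \mathbf{T}^{\top} .
\tag{26}
$$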



In this work, $\boldsymbol{x}$ corresponds to the data vector $\boldsymbol{P}^{(i)}(\ell)$, and $\boldsymbol{y}$ to
the nulled data vector $\boldsymbol{\Pi}^{(i)}(\ell)$, while the transformation is given
by (22). Plugging the relations (26) into (24), one finds that the
exponential remains unchanged, while the prefactor gets an additional
factor $|\det \mathbf{T}|^{-1}$, using $\det(\mathbf{T}\, \mathbf{C}_x \mathbf{T}^{\top}) = \det \mathbf{C}_x \det^2 \mathbf{T}$. This
modification merely leads to a rescaling of the likelihood values,
and thus likelihood contours in parameter space remain unchanged.
Since $\mathbf{T}$ is invertible, the data in $\boldsymbol{x}$ and $\boldsymbol{y}$ contain the
same amount of information about the parameters. Accordingly,
the Fisher matrix is also invariant under this transformation
(Tegmark et al. 1997), which is easily demonstrated by inserting
(26) into (25).
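
The invariance can be checked numerically for the case of a parameter-independent covariance, in which only the second term of (25) survives and the Fisher matrix reduces to $\mathbf{D}^{\top} \mathbf{C}_x^{-1} \mathbf{D}$, with $\mathbf{D}$ the matrix of derivatives of the mean data vector. The following Python sketch is a toy check under that assumption; the matrices and helper functions are invented for illustration and the transformation is taken to be independent of the parameters.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fisher(derivs, cov):
    """F = D^T C^-1 D for a parameter-independent covariance, cf. (25)."""
    return matmul(transpose(derivs), matmul(inverse(cov), derivs))


# Toy inputs: 3 data points, 2 parameters (all numbers are made up).
D = [[1.0, 0.2], [0.5, 1.0], [0.3, -0.4]]                  # d x_alpha / d p_mu
C = [[1.0, 0.2, 0.0], [0.2, 2.0, 0.3], [0.0, 0.3, 1.5]]    # covariance of x
T = [[2.0, 1.0, 0.0], [0.0, 1.0, -1.0], [1.0, 0.0, 3.0]]   # invertible, det = 5

F_x = fisher(D, C)
# Transformed derivatives and covariance according to (26).
F_y = fisher(matmul(T, D), matmul(matmul(T, C), transpose(T)))
print(F_x)
print(F_y)
```

Both printed matrices agree to numerical precision, as expected for any invertible, parameter-independent $\mathbf{T}$.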

However, in the case of nulling the transformation (19) to the
new data vector $\boldsymbol{\Pi}^{(i)}(\ell)$ depends on the cosmological parameters
one aims at determining because the elements of $\mathbf{T}$ are composed
of comoving distances. Hence, the likelihood is now parameter-dependent
in both arguments,



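$$
L_y(\boldsymbol{y}|\boldsymbol{p}) = \left( \det \mathbf{T}(\boldsymbol{p}) \right)^{-1} L_x(\boldsymbol{x}|\boldsymbol{p}) ,
\tag{27}
$$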



where we omitted the modulus of $\det \mathbf{T}$ as this expression can
always be turned positive by swapping two entries of either the
original or the transformed data vector. The prefactor in (27) acts
like a prior on the original likelihood of $\boldsymbol{x}$. In JS08 an example
of the magnitude of the effect of this prior was assessed unintentionally
by not taking into account the prefactor although $\det \mathbf{T}$
differed from unity due to a different normalization. As stated
in JS08, however, the likelihood values of both data sets were
checked to be identical to the level of numerical accuracy. We
conclude that the effect of the prior due to the data transformation
must have been considerably weaker than the one of the flat
prior imposed in the analysis. As far as nulling is concerned, the
prior of (27) only acts on cosmological parameters that enter (3)
in a non-trivial way.

We intend to compute the Fisher matrix for the original and
the transformed data set, in both cases at the point of maximum
likelihood, i.e. for the fiducial set of parameters. At this point in
parameter space we expect the derivative with respect to parameters
to vanish on average, $\langle \partial L / \partial p_\mu \rangle = 0$. If the relation holds for
$L_x(\boldsymbol{x}|\boldsymbol{p})$, it is clear from (27) that this is generally not the case for
$L_y(\boldsymbol{y}|\boldsymbol{p})$. Therefore we set the requirement that $\det \mathbf{T} = 1$, which
is fulfilled by the orthogonal transformation constructed in the
foregoing section. Then one can show that the Fisher matrices of
both data vectors are equivalent, even for a parameter-dependent
data transformation, as is detailed in App. A.

Furthermore, we assume that the original covariance $\mathbf{C}_x$ does
not depend on cosmological parameters. Since an additional cosmology
dependence would lead to tighter constraints, this is a
conservative assumption (see e.g. Eifler et al. 2008). Using the
equivalence of the Fisher matrices, and returning to the notation
in the context of the nulling technique, we then arrive from (25)
at the following expression for the original (index 'orig') and the
nulled (index 'null') data vector (see App. A),



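$$
F^{\rm orig}_{\mu\nu} = \sum_{\alpha,\beta=1}^{N_{\rm d}}
\frac{\partial P_{{\rm GG}\,\alpha}}{\partial p_\mu} \left( C_P^{-1} \right)_{\alpha\beta} \frac{\partial P_{{\rm GG}\,\beta}}{\partial p_\nu}
= \sum_{\alpha,\beta,\gamma,\delta=1}^{N_{\rm d}}
T_{\alpha\gamma} \frac{\partial P_{{\rm GG}\,\gamma}}{\partial p_\mu} \left( C_\Pi^{-1} \right)_{\alpha\beta} T_{\beta\delta} \frac{\partial P_{{\rm GG}\,\delta}}{\partial p_\nu}
= F^{\rm null}_{\mu\nu} ,
\tag{28}
$$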



where $\boldsymbol{P}_{\rm GG}$ and $\mathbf{T}$ are the lensing power spectrum data vector
and the nulling transformation matrix of the full data set,
respectively. The data vectors of the full set have the dimension
$N_{\rm d} = N_\ell N_z (N_z - 1)/2$ if $N_\ell$ angular frequency bins are
considered. The covariance matrices of the original and nulled
power spectra are denoted by $\mathbf{C}_P$ and $\mathbf{C}_\Pi$. The equality of original
and nulled Fisher matrix, i.e. the Fisher matrix after performing
the nulling rotation, directly follows from the second
equation of (26). However, the actual nulling step removes elements
from the transformed data vector, thereby reducing the dimension
of the nulled data vector to $N_\ell (N_z - 1)(N_z - 2)/2$ and causing
$F^{\rm null,red}_{\mu\nu} < F^{\rm null}_{\mu\nu}$, where $F^{\rm null,red}_{\mu\nu}$ denotes the Fisher matrix
computed from the nulled data vector after the removal of the
contaminated power spectra with $q = 0$.
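
The bookkeeping of data vector dimensions can be made concrete with a few lines of Python. The numbers of redshift and angular frequency bins used below are arbitrary example values, and the function names are hypothetical.

```python
def n_cross_spectra(n_z, n_ell):
    """Number of cross-correlation power spectra, N_ell * N_z * (N_z - 1) / 2."""
    return n_ell * n_z * (n_z - 1) // 2


def n_nulled_spectra(n_z, n_ell):
    """Entries left after dropping one q = 0 spectrum per initial bin and ell."""
    return n_cross_spectra(n_z, n_ell) - n_ell * (n_z - 1)


n_z, n_ell = 10, 20   # assumed example values
n_orig = n_cross_spectra(n_z, n_ell)
n_null = n_nulled_spectra(n_z, n_ell)
print(n_orig, n_null, n_null / n_orig)
```

For 10 redshift bins, nulling thus discards a fifth of the entries of the data vector, independent of the number of angular frequency bins.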

Since the inverse Fisher matrix is an estimate for the parameter
covariance matrix, we compute the marginalized statistical
errors as $\sigma(p_\mu) = \sqrt{(F^{-1})_{\mu\mu}}$. Due to the Cramér-Rao
inequality this is a lower bound on the error. To assess the effect of
the systematic, we also calculate the bias on every parameter by
means of the bias formalism (Kim et al. 2004; Huterer & Takada
2005; Huterer et al. 2006; Taylor et al. 2007; Amara & Réfrégier
2008; Kitching et al. 2008). Assuming a systematic $P_{\rm GI}$ that is
subdominant with respect to the signal and causes only small
systematic errors, the bias $b$ on a parameter $p_\mu$ can be calculated
by



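$$
b(p_\mu) = \sum_\nu \left( F^{-1} \right)_{\mu\nu}
  \sum_{\alpha,\beta=1}^{N_d}
  P_{{\rm GI},\alpha}
  \left( C_P^{-1} \right)_{\alpha\beta}
  \frac{\partial P_{{\rm GG},\beta}}{\partial p_\nu} \;, \qquad (29)
$$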



and likewise for the nulled data set. A formal derivation of the
bias formalism, including a discussion of its limitations, can be
found in App. B.
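
The following NumPy sketch illustrates the statements of this section on toy numbers: the Fisher matrix is unchanged by an orthogonal rotation of the data vector, removing elements from the rotated vector can only increase the marginalized errors, and the bias of (29) is linear in the systematic. All arrays are random placeholders and the names `fisher` and `bias` are our own; nothing here reproduces the actual power spectra or covariances of the analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_d, n_p = 12, 3                      # toy sizes: data points, parameters

D = rng.normal(size=(n_d, n_p))       # stands in for dP_GG/dp (toy numbers)
A = rng.normal(size=(n_d, n_d))
C = A @ A.T + n_d * np.eye(n_d)       # symmetric positive definite covariance
P_gi = 0.05 * rng.normal(size=n_d)    # small toy systematic


def fisher(D, C):
    """F = D^T C^-1 D for a parameter-independent covariance."""
    return D.T @ np.linalg.solve(C, D)


def bias(D, C, P_sys):
    """Parameter bias b = F^-1 D^T C^-1 P_sys, cf. Eq. (29)."""
    return np.linalg.solve(fisher(D, C), D.T @ np.linalg.solve(C, P_sys))


# An orthogonal matrix (here from a QR decomposition) plays the role of T.
T, _ = np.linalg.qr(rng.normal(size=(n_d, n_d)))

F_orig = fisher(D, C)
F_null = fisher(T @ D, T @ C @ T.T)   # rotation only: no information lost

# Dropping elements of the rotated vector mimics the actual nulling step.
keep = np.arange(3, n_d)
F_red = fisher((T @ D)[keep], (T @ C @ T.T)[np.ix_(keep, keep)])

sigma_orig = np.sqrt(np.diag(np.linalg.inv(F_orig)))
sigma_red = np.sqrt(np.diag(np.linalg.inv(F_red)))
b_orig = bias(D, C, P_gi)
print(sigma_orig, sigma_red, b_orig)
```

In this toy setup the information loss stems entirely from discarding rows after the rotation, which is the mechanism described in the text.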



3. Modeling 

3.1. Redshift distributions 

To model realistic redshift probability distributions of galaxies
in the presence of photometric redshift errors, we keep close to
the formalisms used in Ma et al. (2006) and Amara & Réfrégier
(2007). We assume survey parameters that should be representative
of any future space-based mission aimed at precision measurements
of cosmic shear, such as the Euclid satellite proposed
to ESA. Note that the probability distributions of comoving distances
and redshift, used in parallel in this work, are related via

$p_z(z) = p_\chi(\chi)\, {\rm d}\chi/{\rm d}z$.

According to Smail et al. (1994) we assume an overall redshift
probability distribution

$$
p_{\rm tot}(z) \propto \left( \frac{z}{z_0} \right)^2
  \exp \left\{ - \left( \frac{z}{z_0} \right)^\beta \right\} \;, \qquad (30)
$$

with $\beta = 1.5$. To get a median redshift of $z_{\rm med} = 0.9$, we choose
$z_0 = 0.64$. The distribution is cut at $z_{\rm max} = 3$ and then normalized
to unity. The total distribution of galaxies per unit survey area is
then $n_{\rm tot}(z) = n\, p_{\rm tot}(z)$, where $n$ is the total number density of
galaxies. The choice of photometric redshift bin boundaries for
the tomography is in principle arbitrary. Here, we divide $p_{\rm tot}(z)$
into $N_z$ photometric redshift bins such that every bin contains the
same number of galaxies, i.e.

$$
\int_{z_{i-1}}^{z_i} {\rm d}z\; p_{\rm tot}(z) = \frac{1}{N_z}
  \quad \mbox{for every} \quad i = 1, \dots, N_z \;, \qquad (31)
$$

where the $z_i$ mark the redshifts of the bin boundaries, and where
the lowest boundary is $z_0 = 0$ (not to be confused with the parameter
$z_0$ of (30)) and $z_{N_z} = z_{\rm max}$. This choice of binning is solely for
computational convenience and to allow for easy comparisons of setups
with a different number of bins. The nulling technique as
such does not rely on any particular choice of photometric redshift
binning.
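
As a minimal numerical sketch of (30) and (31), the following NumPy snippet normalizes the distribution on $[0, z_{\rm max}]$, reads off the median, and places the bin boundaries at equal steps of the cumulative distribution. The grid resolution and the helper name `equal_number_boundaries` are our own choices made for illustration.

```python
import numpy as np

Z0, BETA, Z_MAX = 0.64, 1.5, 3.0   # values quoted in the text


def p_tot_unnormalised(z):
    """Overall redshift distribution of Eq. (30), up to normalization."""
    return (z / Z0) ** 2 * np.exp(-(z / Z0) ** BETA)


z = np.linspace(0.0, Z_MAX, 30001)
p = p_tot_unnormalised(z)

# Cumulative trapezoidal integral, then normalize to unity on [0, Z_MAX].
cdf = np.concatenate(([0.0], np.cumsum(0.5 * (p[1:] + p[:-1]) * np.diff(z))))
p_tot = p / cdf[-1]
cdf /= cdf[-1]


def equal_number_boundaries(n_z):
    """Bin boundaries such that each bin holds a fraction 1/n_z, Eq. (31)."""
    return np.interp(np.linspace(0.0, 1.0, n_z + 1), cdf, z)


z_med = np.interp(0.5, cdf, z)     # median of the cut distribution
edges = equal_number_boundaries(5)
print(round(float(z_med), 3), np.round(edges, 3))
```

Inverting the cumulative distribution by interpolation is a convenience of this sketch; any root finder applied to (31) would serve equally well.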

Our model for photometric redshift errors accounts for two
effects: a statistical uncertainty characterized by the redshift
dispersion $\sigma_{\rm ph}(1+z)$, and misidentifications of a fraction $f_{\rm cat}$ of
galaxies with offsets from the center of the distribution of $\pm\Delta_z$.
We write the conditional probability of obtaining a photometric
redshift $z_{\rm ph}$ given the true, spectroscopic redshift $z$ as

$$
p(z_{\rm ph}|z) \propto (1 - f_{\rm cat})\; G\left(z_{\rm ph}; z, \sigma_{\rm ph}(1+z)\right)
  + \frac{f_{\rm cat}}{2} \left\{ G\left(z_{\rm ph}; z_+, \sigma_{\rm ph}(1+z_+)\right)
  + G\left(z_{\rm ph}; z_-, \sigma_{\rm ph}(1+z_-)\right) \right\} \;, \qquad (32)
$$

where $G(z_{\rm ph}; z, \sigma)$ is a Gaussian with mean $z$ and dispersion
$\sigma$, and where $z_+ = z + \Delta_z$ and $z_- = z - \Delta_z$. When integrating
(32) over $z_{\rm ph}$ with infinite range, it yields unity for every $z$.
However, since we consider a finite redshift range, the distributions
corresponding to the lowest and highest photometric redshift
bins and those with significant outlier population will be
cut at 0 and $z_{\rm max}$, so that we normalize $p(z_{\rm ph}|z)$ by demanding
$\int_0^{z_{\rm max}} {\rm d}z_{\rm ph}\; p(z_{\rm ph}|z) = 1$ for every $z$. Multiplying $p(z_{\rm ph}|z)$ with
the overall redshift probability distribution of galaxies $p_{\rm tot}(z)$
yields the two-dimensional probability of obtaining a pair of
redshift measurements $\{z_{\rm ph}, z\}$. When integrating this probability
over photometric redshift within the bin boundaries defined
above, one arrives at the true probability distribution of galaxies
for every photometric redshift bin $i$,

$$
p^{(i)}(z) = \frac{ p_{\rm tot}(z) \int_{z_{i-1}}^{z_i} {\rm d}z_{\rm ph}\; p(z_{\rm ph}|z) }
  { \int_0^{z_{\rm max}} {\rm d}z'\; p_{\rm tot}(z') \int_{z_{i-1}}^{z_i} {\rm d}z_{\rm ph}\; p(z_{\rm ph}|z') } \;. \qquad (33)
$$

Fig. 1. Number density distribution of galaxies for a division into
$N_z = 5$ redshift bins, rendered dimensionless through dividing
by the total number density $n$. The thick solid line corresponds
to the overall galaxy number density distribution, normalized to
unity. The thin curves represent the distributions corresponding
to the five photometric redshift bins, normalized to $1/N_z$.
The original bin boundaries are chosen according to (31). Note
that the individual distributions add up to the total
distribution for every $z$. Top panel: Resulting distributions for
$\sigma_{\rm ph} = 0.05$ and no catastrophic outliers. Bottom panel: Resulting
distributions for $\sigma_{\rm ph} = 0.05$, $f_{\rm cat} = 0.1$, and $\Delta_z = 1.0$.

Due to the multiplication by $p_{\rm tot}(z)$ these distributions are limited
to the interval $[0, z_{\rm max}]$ although (32) is non-vanishing outside
that range. To ensure that the dispersions of the Gaussians in
(32) are positive, $\Delta_z \le 1$ is required. In this work we keep
$\Delta_z = 1$ fixed since this choice produces outlier distributions that are
well separated from the central peak, as also found in realistic
situations, see below.

The number density of galaxies located in photometric redshift
bin $i$ as a function of spectroscopic redshift is given by

$$
n^{(i)}(z) = n_{\rm tot}(z) \int_{z_{i-1}}^{z_i} {\rm d}z_{\rm ph}\; p(z_{\rm ph}|z) \;, \qquad (34)
$$

so that evidently $\sum_i n^{(i)}(z) = n_{\rm tot}(z)$ for every redshift $z$. Using
this last equation and multiplying (31) by $n$, one sees that the
number density of galaxies whose true redshifts lie between the
bin boundaries defined by (31) is the same
for all bins, namely $n/N_z$, as requested. However, the number
densities of galaxies per photometric redshift bin, i.e.
$n^{(i)} = \int_0^{z_{\rm max}} {\rm d}z\; n^{(i)}(z)$, are generally not identical. The photometric
redshift errors lead to a redistribution of galaxies, which will in
our model cause the outermost galaxy distributions to contain
slightly more objects than $n/N_z$.
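
The construction of the bin-wise distributions can be sketched numerically as follows. The snippet integrates the Gaussians of (32) analytically via the error function, enforces the normalization over $[0, z_{\rm max}]$, and builds (34) in units of $n$ for five equally populated bins. Parameter values follow the bottom panel of Fig. 1; the grid, the small floor on the dispersion (needed where $1 + z_-$ vanishes) and the helper names are assumptions of this sketch.

```python
import math
import numpy as np

Z0, BETA, Z_MAX = 0.64, 1.5, 3.0
SIGMA_PH, F_CAT, DELTA_Z = 0.05, 0.1, 1.0
N_Z = 5

erf = np.vectorize(math.erf)


def gauss_mass(a, b, mean, sigma):
    """Integral of a Gaussian with given mean and dispersion over [a, b]."""
    s = np.sqrt(2.0) * np.maximum(sigma, 1e-12)   # guard against sigma = 0
    return 0.5 * (erf((b - mean) / s) - erf((a - mean) / s))


def mixture_mass(a, b, z):
    """Unnormalized integral of Eq. (32) over z_ph in [a, b] at true z."""
    total = (1.0 - F_CAT) * gauss_mass(a, b, z, SIGMA_PH * (1.0 + z))
    for z_out in (z + DELTA_Z, z - DELTA_Z):
        total += 0.5 * F_CAT * gauss_mass(a, b, z_out, SIGMA_PH * (1.0 + z_out))
    return total


z = np.linspace(0.0, Z_MAX, 3001)
p_tot = (z / Z0) ** 2 * np.exp(-(z / Z0) ** BETA)
cdf = np.concatenate(([0.0], np.cumsum(0.5 * (p_tot[1:] + p_tot[:-1]) * np.diff(z))))
p_tot /= cdf[-1]
cdf /= cdf[-1]
edges = np.interp(np.linspace(0.0, 1.0, N_Z + 1), cdf, z)   # Eq. (31)

# Dividing by the mass inside [0, Z_MAX] enforces int dz_ph p(z_ph|z) = 1.
norm = mixture_mass(0.0, Z_MAX, z)
n_bins = np.array([p_tot * mixture_mass(lo, hi, z) / norm
                   for lo, hi in zip(edges[:-1], edges[1:])])

# Galaxy fraction per photometric bin (trapezoidal weights on the z grid).
dz = z[1] - z[0]
w = np.full(z.size, dz)
w[0] = w[-1] = 0.5 * dz
totals = n_bins @ w
print(np.round(totals, 4))
```

By construction the bin-wise distributions add up to the overall distribution at every redshift, while the individual bin totals need not equal $1/N_z$.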

Two examples for galaxy distributions $n^{(i)}(z)$ obtained via
this formalism are shown in Fig. 1, one without outliers and with
a dispersion of $\sigma_{\rm ph} = 0.05$, and one where outliers with $f_{\rm cat} = 0.1$
at an offset $\Delta_z = 1$ have been added. As is evident from the plot
in the lower panel, the outlier Gaussians are modified by (33)
into elongated bumps, which are well separated from the central
peak. They are most prominent as a distribution with $z > 1$, being
part of the lowest photometric bin, and a broad distribution at
low redshifts, belonging to the highest photometric bin. This behavior
is qualitatively in good agreement with the characteristic
shape of the scatter plots in the spectroscopic redshift - photometric
redshift plane, as for instance analyzed in Abdalla et al.
(2007), which also justifies our choice of $\Delta_z = 1$.

Fig. 2. Relation between $f_{\rm cat}$ and the true fraction of outliers in
the redshift distributions $r_{\rm out}$. The gray area marks the range of
possible values of $r_{\rm out}$ if $\sigma_{\rm ph}$ lies in the interval [0.01; 0.1], where
$\sigma_{\rm ph} = 0.01$ produces the upper limit and $\sigma_{\rm ph} = 0.1$ the lower
limit of the gray region. A one-to-one relation is indicated by
the solid black line.

To judge the performance of nulling in the presence of catastrophic
outliers in the redshift distributions, it is important to
note that $f_{\rm cat}$ does not equal the true fraction of outliers, primarily
because of the subsequent multiplication of (32) by the overall
redshift distribution $p_{\rm tot}(z)$, see (33). We compute the true
fraction of outliers, denoted by $r_{\rm out}$, as the part of a redshift
distribution that is contained in the two outlier Gaussians of our
model. A quantity $p_{\rm cat}(z_{\rm ph}|z)$ is defined identically to (32), but
with the first term, i.e. the central Gaussian, removed. Then we
define the outlier fraction as

$$
r_{\rm out} = \frac{1}{N_z} \sum_{i=1}^{N_z}
  \frac{ \int_0^{z_{\rm max}} {\rm d}z\; p_{\rm tot}(z) \int_{z_{i-1}}^{z_i} {\rm d}z_{\rm ph}\; p_{\rm cat}(z_{\rm ph}|z) }
       { \int_0^{z_{\rm max}} {\rm d}z\; p_{\rm tot}(z) \int_{z_{i-1}}^{z_i} {\rm d}z_{\rm ph}\; p(z_{\rm ph}|z) } \;, \qquad (35)
$$

where $r_{\rm out}$ is averaged over all photometric redshift bins.

In Fig. 2 the relation between $r_{\rm out}$ and $f_{\rm cat}$ for fixed $\Delta_z = 1.0$
is plotted. The gray region comprises the results for the range
from $\sigma_{\rm ph} = 0.01$ to $\sigma_{\rm ph} = 0.1$. Evidently, the true fraction of outliers
is smaller than $f_{\rm cat}$, reaching up to about 6 % for $f_{\rm cat} \le 0.1$.
The strongest contribution to $r_{\rm out}$ originates from the bins at
the lowest and highest redshifts, where the outlier distributions
are enhanced because one of the outlier Gaussians is located
in a redshift regime where $p_{\rm tot}(z)$ attains high values. The redshift
distributions centered at medium redshifts have their central
Gaussian at $z \sim 1$ where $p_{\rm tot}(z)$ peaks, so that the outlier fraction
in the corresponding bins is small.

In the following, we will consider the range $0 \le f_{\rm cat} \le 0.1$,
which yields outlier fractions that should comprise realistic limits
of catastrophic failures in the photometric redshift determination
of surveys aimed at measuring cosmic shear tomography
(see Abdalla et al. 2007). For the COSMOS field Ilbert et al.
(2009) found photometric redshift dispersions in the range between
0.007 for the brightest galaxies and 0.06 for fainter objects
up to $z \sim 2$. Taking these values as a reference, we are going
to consider the range $0 \le \sigma_{\rm ph} \le 0.1$.

3.2. Lensing power spectra 

As the basis for our analysis we use sets of tomography lensing
power spectra which are computed for a $\Lambda$CDM universe
with fiducial parameters $\Omega_{\rm m} = 0.25$, $\Omega_{\rm DE,0} = 0.75$, and
$H_0 = 100\, h_{100}$ km/s/Mpc with $h_{100} = 0.7$. Throughout, the spatial
geometry of the Universe is assumed to be flat. We incorporate a
variable dark energy scenario by parametrizing its equation of
state, relating pressure $p_{\rm DE}$ to density $\rho_{\rm DE}$, as

$$
p_{\rm DE} = \left[ w_0 + w_a \frac{z}{1+z} \right] \rho_{\rm DE}\, c^2 \;, \qquad (36)
$$

where the cosmological constant is chosen as the fiducial model,
i.e. $w_0 = -1$ and $w_a = 0$. Then the dark energy density parameter
reads



$$
\Omega_{\rm DE}(z) = \Omega_{\rm DE,0}\, \exp \left\{ 3 \left[ w_a \frac{z}{1+z}
  - (w_0 + w_a + 1) \ln(1+z) \right] \right\} \;. \qquad (37)
$$

The three-dimensional power spectrum of matter density fluctuations
$P_{\delta\delta}$ is further specified by the primordial slope $n_{\rm s} = 1$,
the normalization $\sigma_8 = 0.9$ and the shape parameter $\Gamma$, calculated
according to Sugiyama (1995) with $\Omega_{\rm b} = 0.05$. Using
the transfer function of Eisenstein & Hu (1998) (without baryonic
wiggles), the non-linear power spectrum is computed by
means of the fit formula of Peacock & Dodds (1996). The tomography
power spectra are then determined via (2), incorporating
the photometric redshift models of the foregoing section,
for $N_\ell = 100$ logarithmic angular frequency bins between $\ell = 10$
and $\ell = 2 \cdot 10^4$.

The nulled power spectra $\Pi^{(i)}_{[q]}(\ell)$ are then calculated via
(19). The zeroth-order nulling weights $T^{(i)}_{[0]}$, see (18), are computed
for the fiducial cosmology, while the higher orders are obtained by
Gram-Schmidt ortho-normalization. The Gram-Schmidt procedure
does not uniquely define the order of the orthogonal vectors,
so that no particular ordering is assigned to $q$, as opposed
to the approach in JS08, where a higher order $q$ corresponded to
a lower information content in $\Pi^{(i)}_{[q]}(\ell)$.
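
A small sketch may help to visualize the Gram-Schmidt step. For one initial bin, the snippet below normalizes a zeroth-order weight vector of the form $1 - \chi(z_i)/\chi(z_j)$, completes it to an orthonormal basis, and checks that a signal with the same distance dependence only survives in the order $q = 0$. The comoving distances are made-up numbers, the unit vectors used as seeds are an arbitrary choice (which is precisely why the higher orders carry no particular ordering), and the function name `nulling_basis` is ours.

```python
import numpy as np


def nulling_basis(chi_i, chi_bg):
    """Rows are orthonormal weight vectors of order q = 0, 1, ..."""
    w0 = 1.0 - chi_i / np.asarray(chi_bg, dtype=float)   # zeroth order
    n = w0.size
    basis = [w0 / np.linalg.norm(w0)]
    for e in np.eye(n):                  # seed vectors: arbitrary choice
        v = e.copy()
        for b in basis:                  # modified Gram-Schmidt step
            v -= (v @ b) * b
        norm = np.linalg.norm(v)
        if norm > 1e-10 and len(basis) < n:
            basis.append(v / norm)
    return np.array(basis)


chi_i = 1000.0                                        # toy distance of bin i
chi_bg = np.array([1500.0, 1900.0, 2300.0, 2600.0])   # toy background bins
T = nulling_basis(chi_i, chi_bg)

gi_like = 1.0 - chi_i / chi_bg   # signal with the GI-type distance dependence
nulled = T @ gi_like             # only the q = 0 entry is non-zero
print(np.round(nulled, 12))
```

Discarding the first row of `T` then corresponds to the removal of the contaminated order, while the remaining rows form the nulled measures for this initial bin.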

On applying nulling to a real data set, one has to assume the
values of the relevant parameters $\Omega_{\rm m}$, $\Omega_{\rm DE}$, $w_0$, and $w_a$ to obtain
$T^{(i)}_{[0]}$. Whilst it is a realistic premise that these parameters are
approximately known, slightly incorrect assumptions may degrade
the downweighting of the GI signal, but do not introduce a new
bias to the parameter estimation, as will be assessed in detail
in Sect. 4.2. A sample of both original and nulled tomography
power spectra is plotted in Fig. 3. For this sample the nulling
has been performed following variant (C), which will be discussed
in detail in Sect. 4.1.

As regards the calculation of the power spectrum covariance
(Joachimi et al. 2008, and references therein), entering the
Fisher matrix, we have to specify further survey characteristics
in addition to the aforementioned redshift probability distribution.
We assume a survey size of 20,000 deg$^2$ and a total number
density of galaxies of $n = 35$ arcmin$^{-2}$, resulting in approximately
$35/N_z$ arcmin$^{-2}$ galaxies per photometric redshift bin.
To compute shot noise, the dispersion of intrinsic ellipticities is
set to $\sigma_\epsilon = 0.35$. These survey parameters correspond to those
representative of future cosmic shear satellite missions such as
Euclid.



3.3. Intrinsic alignment signal 

To quantify the bias on cosmological parameters before and after
nulling, a GI systematic power spectrum is added to the
data vector. We adopt the 'non-linear linear alignment model'
of Bridle & King (2007), who suggest computing the three-dimensional
matter-intrinsic shear cross-power spectrum as

$$
P_{\delta {\rm I}}(k,z) = - C_{\rm GI}\, \rho_{\rm cr}\,
  \frac{\Omega_{\rm m} (1+z)^2}{D(z)}\, P_{\delta\delta}(k,z) \;, \qquad (38)
$$

where $\rho_{\rm cr}$ is the critical density, and where $D(z)$ denotes the
growth factor, normalized to unity for $z = 0$. The constant
$C_{\rm GI}$ has units of inverse density and was determined by HS04
through comparison with SuperCOSMOS (Brown et al. 2002);
according to Bridle & King (2007), we set $C_{\rm GI}\, \rho_{\rm cr} \approx 0.0134$. The
corresponding II power spectrum reads

$$
P_{\rm II}(k,z) = C_{\rm GI}^2\, \rho_{\rm cr}^2\,
  \frac{\Omega_{\rm m}^2 (1+z)^4}{D^2(z)}\, P_{\delta\delta}(k,z) \;. \qquad (39)
$$



Originating from analytical considerations by HS04, the linear
alignment model in the form employed here lacks solid physical
motivation, but fits within the error bars of Mandelbaum et al.
(2006). It also provides reasonable fits to the results of the halo
model considerations by Schneider & Bridle (2009).
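
The two model spectra share a single prefactor, which the following sketch makes explicit. It evaluates (38) and (39) for a few toy values of the matter power spectrum; the numbers chosen for $P_{\delta\delta}$ and for the growth factor are placeholders for illustration only, and the function names are ours.

```python
import numpy as np

C_GI_RHO_CR = 0.0134        # C_GI * rho_cr as quoted in the text


def la_prefactor(z, omega_m, growth):
    """Common factor C_GI rho_cr Omega_m (1+z)^2 / D(z)."""
    return C_GI_RHO_CR * omega_m * (1.0 + z) ** 2 / growth


def p_delta_i(p_dd, z, omega_m, growth):
    """Matter-intrinsic cross-power spectrum, Eq. (38)."""
    return -la_prefactor(z, omega_m, growth) * p_dd


def p_ii(p_dd, z, omega_m, growth):
    """Intrinsic ellipticity power spectrum, Eq. (39)."""
    return la_prefactor(z, omega_m, growth) ** 2 * p_dd


p_dd = np.array([2.0e4, 5.0e3, 4.0e2])   # toy matter power values
z, omega_m, growth = 0.9, 0.25, 0.64     # growth factor: made-up number
gi = p_delta_i(p_dd, z, omega_m, growth)
ii = p_ii(p_dd, z, omega_m, growth)
print(gi, ii)
```

In this form the GI term is negative throughout, and the relation $P_{\delta {\rm I}}^2 = P_{\rm II}\, P_{\delta\delta}$ holds at every $k$ and $z$.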

While the nulling technique as such is completely independent
of the actual functional form of the systematic, the residual
bias does depend on the GI signal. Thus, we consider an additional
set of simplistic power-law GI power spectra for reference.
They are given by

$$
P_{\delta {\rm I}}(k,z) = - A_{\rm GI} \left( \frac{k}{k_{\rm ref}} \right)^{s-2} (1+z) \;, \qquad (40)
$$

where $k_{\rm ref} = 1\, h_{100}$/Mpc. As is evident from (29), the produced
bias is simply proportional to the amplitude of the systematic,
so that we do not need to investigate variations of the
overall magnitude of the GI term. Hence, we relate the normalization
of (40) to the linear alignment model (38), and set
$A_{\rm GI} = |P_{\delta {\rm I}}(k_{\rm ref}, z_{\rm med})|\, (1 + z_{\rm med})^{-1}$. For the power law slope we
use the values $s = \{0.1, 0.4, 0.7\}$, where the central value best
reproduces the average slope of the linear alignment model power
spectra. The tomography power spectra are then obtained via ©. 

The resulting power spectra are also shown in Fig. 3. As
already mentioned in Bridle & King (2007), the linear alignment
model produces a strong systematic, partially surpassing
the lensing signal in amplitude for cross-correlations of largely
different redshift bins. Since the GI term is negative, the sum
of lensing and intrinsic alignment power spectrum can become
negative in the corresponding $\ell$-range in these cases.$^1$ Due to
our choice of normalization, the power-law toy GI signal can
dominate the lensing power spectrum on even larger angular frequency
intervals.

After nulling, the systematic is largely suppressed, oscillating
around zero for the lower redshift bins. Still, significant
residual signals remain because the finite extent of the redshift
probability distributions has been neglected in the derivation of
nulling. In particular, the systematic signal is eliminated only at a
single redshift within each bin, thus being merely downweighted
in neighboring redshift ranges. A detailed discussion about the
sources of the residual bias will follow in Sect. 5. We note that
nulling works independently of the strength of the systematic; it
can even be applied to data in which the GI term surpasses the
cosmic shear signal.

$^1$ Note however that the total power spectrum of auto-correlations of
ellipticities, i.e. GG+GI+II, always has to be positive by definition.

Fig. 3. Original and nulled tomography power spectra as a function of angular frequency. The survey has been divided into $N_z = 10$
photometric redshift bins with dispersion $0.03(1 + z)$. Top right panels: Lensing power spectra $P^{(ij)}_{\rm GG}(\ell)$ are shown as solid lines. The
modulus of linear alignment model GI power spectra $P^{(ij)}_{\rm GI}(\ell)$ is given by dashed lines, the corresponding II signal by gray curves.
In each panel the redshift bins $i$ and $j$ are plotted. In the panels with the combinations $i, j \in \{1, 9\}$ the absolute values of the power
law GI models have been added for reference as dotted curves. Note that the II power spectrum becomes very small if $i$ and $j$ are
largely different. Bottom left panels: The absolute values of the nulled lensing and linear alignment model systematic power spectra
are shown as solid (GG), dashed (GI), and gray (II) curves, respectively. In each panel the corresponding redshift bin $i$ and the order
$q$ are given. The nulled measures do not have a particular ordering in $q$, see text for details. For the lower redshift bins the GI signal
is oscillating around zero. The II signal becomes very small for higher orders $q$.

We have also added II power spectra to Fig. 3 in order to
judge to what extent our assumption of dropping the II signal in
our considerations is valid. The original II power spectra yield
a strong contribution for auto-correlations, but drop off quickly
if the correlated redshift distributions have less overlap. In the
transformed data set, the II contamination is smaller than the
residual GI signal and thus negligible for power spectra with
$q > 1$. For $q = 1$ however, the II signal is significant such that
in this case nulling would have to be preceded by an II removal
technique. In the limit of completely disjoint photometric bins,
the II signal would be confined to auto-correlations in the original
data set. Since these are not included in the construction of
the nulled power spectra, the latter would be completely free of
II terms in this idealized case.

To ensure that the II term remains sufficiently small compared
to the GG signal, one could restrict the subsequent analysis
partly to larger angular scales. For instance, to achieve a
minimum suppression by a factor $s$ of the II signal with respect
to the lensing signal, we determine maximum allowed $\ell$-values,
given in Table 1. These upper bounds would only have to be applied
to orders $q = 1$, and are valid in the case of the setup used
to produce Fig. 3. The limitations due to the II contamination are
expected to become more restrictive as the photometric redshift
scatter increases.

Table 1. Upper limits on the allowed angular frequency range
if the II contamination in the nulled data shall be suppressed by
at least a factor of $s$ with respect to the nulled GG term. These
limitations apply only for orders $q = 1$, and only if nulling is
not preceded by a suitable II removal technique, as we advocate.
The parameters are the same as in Fig. 3. Note that in a narrow
range around $\ell \sim 100$ the II signal can be close to or slightly
above the limit imposed by $s$.

initial bin i    s = 3    s = 5
1                1170     20
2                3420     1470
3                5420     2330
4                7960     3170
5                11680    4310
6                none     5860
7                none     7960
8                none     13620

Alternatively, our findings suggest that, due to the confinement
of the II term to a limited set of nulled power spectra,
a treatment of the II signal after nulling may also provide
a promising ansatz. In the current implementation the nulled
power spectra of order $q = 1$ have a dominating contribution
from original power spectra $P^{(ij)}(\ell)$ with $j = i + 1$, which contain
the bulk of the II signal after the removal of auto-correlations
from the analysis. Hence, the residual II terms accumulate within
the measures of order $q = 1$. The freedom to choose the weights
of (19) in the subspace orthogonal to $T^{(i)}_{[0]}$ allows for a more specific
treatment of the II signal in the nulled data. We emphasize
that the final goal is a simultaneous removal of all intrinsic alignment
contributions, but this is beyond the scope of this paper and
subject to future work.

As the GI contamination has a large amplitude, the question
is raised whether the bias formalism, i.e. (29), still yields accurate
results. The effect of a large systematic is investigated in
detail in App. B. We conclude from our findings that even for a
strong GI term the bias is obtained with good accuracy whereas
the statistical errors, which are also affected by a strong systematic,
can deviate more significantly. To guarantee results that are
as close as possible to a full likelihood analysis, we downscale
all GI signals by a factor of five throughout the subsequent sections.
Since the bias is proportional to the overall amplitude of
the systematic, and since we are mostly going to consider ratios
of biases, the rescaling does not have an influence on the statements
concerning the performance of nulling. Merely the mean
square error, defined by

$$
\sigma_{\rm tot}(p_\mu) = \sqrt{ \sigma^2(p_\mu) + b^2(p_\mu) } \;, \qquad (41)
$$

is affected because the systematic error becomes less dominant.
A lower systematic amplitude slightly disfavors nulling as it
lowers the bias while causing an increase in statistical errors.
Besides, limiting the strength of biases avoids unphysical parameter
estimates such as $\Omega_{\rm m} < 0$. Such effects are normally
avoided by priors, which have not been included in our Fisher
matrix analysis though.

In surveys with a significant GI systematic, intrinsic ellipticity
correlations are likely to affect parameter estimation, too.
To restrict our considerations to the GI contamination, we follow
Takada & White (2004), excluding auto-correlations from
both original and nulled data vectors, and assuming that the remaining
measures do not have an II signal. Note that due to the
exclusion of auto-correlation power spectra the statistical errors
on cosmological parameters in this work are larger than those of
other cosmic shear tomography analyses, even for our original
data sets.

Table 2. Overview of nulling variants considered. The variants
differ by the redshifts assigned to the foreground and background
photometric redshift bins, and by the form of the zeroth-order
weight function.

variant   foreground        background       0th order weights
(A)       bin center        lower boundary   $1 - \chi(z_i)/\chi(z_j)$
(B)       bin center        bin center       $1 - \chi(z_i)/\chi(z_j)$
(C)       median redshift   bin center       $g^{(j)}(\chi(z_i))$

Excluding auto-correlations is of limited accuracy to control
the II signal since we use a relatively dense binning, partially
with large photometric errors, so that cross-correlations
of adjacent photometric redshift bins would contain significant
II terms as well. With realistic data one could in principle let
the nulling be preceded by an II removal technique such as that of
King & Schneider (2002), who also take a purely geometric approach.
However, the redshift-dependent weighting of galaxy
pairs, on which the II removal is based, modifies the calculation
of the projected cosmic shear measures, which
in turn entails a modification of the nulling weights. The improvements
of the nulling technique we investigate in Sect. 5.3
will also constitute an efficient tool to control the II term.



4. Improving the nulling performance 

4.1. Optimizing the nulling weights 

In the composition of the nulling weights (18) one has the freedom
to choose the specific redshift $z_i$ within the initial bin at
which the GI contribution is eliminated, as well as the referencing
of redshifts $z_j$ to the background redshift bins. For convenience
JS08 placed $z_i$ at the center of the initial bin and identified
$z_j$ with the lower boundary of bin $j$. Since this choice was
fairly arbitrary, we seek to find a more appropriate referencing
that leads to a minimum residual GI contamination.

A more natural choice is to position both the redshift of the
initial bin $z_i$ and the reference redshifts of the background bins
at the center between the photometric redshift bin boundaries.
This setup does not require knowledge about the
redshift probability distribution of each bin, although this information
has to be available at high precision for future cosmic
shear surveys. Hence, we furthermore define nulling weights that
take redshift information into account. Re-examining (17), one
can drop the approximation of narrow redshift/distance probability
distributions for the background bins, keeping the first
equality of (17). Thereby, instead of the comoving distance ratio
$(1 - \chi(z_i)/\chi(z_j))$, one directly uses the lensing efficiency, which
is the average of this ratio, weighted by the redshift/distance
probability distribution of the background photometric redshift
bin. The zeroth-order nulling weight in (18) is then given by
$T^{(i)}_{[0]j} = g^{(j)}(\chi(z_i))$. For the remaining free redshift of the initial
bin $z_i$ we choose the median redshift of distribution $i$, a measure
that contains information about the form of the distribution, but
is robust against outliers.

Hence, in total we are going to consider three different versions
of nulling: (A) the 'old' version of nulling with referencing
to the lower boundaries of the background bins, a variant
(B) where the background bins are identified with the bin centers
instead, and (C) the nulling that includes detailed redshift
information via assigning the foreground bins to their median
redshifts and using the comoving distance ratio, weighted
by $p^{(j)}(\chi)$, as the zeroth-order nulling weight. The properties of
these variants are summarized in Table 2.

Fig. 4. Comparison of the performance of the different nulling
weights. Shown are marginalized statistical errors $\sigma$ in the
top panels, relative systematic errors $b_{\rm rel}$ in the center panels,
and mean square errors $\sigma_{\rm tot}$ in the bottom panels. For the
correspondence between considered parameters and line colors/symbols
see the legend. Left column: Change in errors from
original to nulled data set, using the referencing to bin boundaries,
i.e. variant (A). Right column: Residual errors using the
different nulling weights. (A) Referencing to bin boundaries;
(B) Referencing to bin centers; (C) Nulling including detailed
redshift information.

In Fig. 4 the performance of nulling with different nulling
weights is shown. We plot the marginalized statistical error
$\sigma(p_\mu) = \sqrt{(F^{-1})_{\mu\mu}}$ and the relative bias

$$
b_{\rm rel}(p_\mu) = b(p_\mu) / \sigma_{\rm orig}(p_\mu) \;, \qquad (42)
$$

where $\sigma_{\rm orig}$ denotes the statistical error before nulling, for every
cosmological parameter. Note that if we referred the bias after
nulling to the statistical error after nulling, the usual loss of information
due to nulling could cause a decrease in $b/\sigma$ even if
the GI contamination remained completely unmodified. With the
definition (42), $b_{\rm rel}$ is an unambiguous measure of the relative
importance of systematic errors in the data. Moreover, the mean
square error (41) is given in the figure. Here and in the following,
the seven parameters $p = \{\Omega_{\rm m}, \sigma_8, h_{100}, n_{\rm s}, \Omega_{\rm b}, w_0, w_a\}$ are
considered in the Fisher matrix analysis. The data set is composed
of power spectra for $N_z = 10$ bins without photometric
redshift errors, where the systematic stems from the linear alignment
model, downscaled by a factor of five.
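
The reason for referring the bias to the original statistical error can be made concrete with a toy calculation. In the sketch below all numbers are invented for illustration: the bias is deliberately left unchanged, and only the statistical error grows as it would through the information loss of nulling. The helper names are ours.

```python
import math


def total_error(sigma, bias):
    """Eq. (41): quadrature sum of statistical and systematic error."""
    return math.hypot(sigma, bias)


def relative_bias(bias, sigma_orig):
    """Eq. (42): bias in units of the statistical error before nulling."""
    return bias / sigma_orig


sigma_orig, sigma_null = 0.02, 0.06   # toy statistical errors
bias_unchanged = 0.10                 # toy bias, assumed NOT reduced at all

naive_before = bias_unchanged / sigma_orig
naive_after = bias_unchanged / sigma_null    # drops although nothing was removed
b_rel_after = relative_bias(bias_unchanged, sigma_orig)   # stays the same

print(naive_before, naive_after, b_rel_after)
print(total_error(sigma_orig, bias_unchanged), total_error(sigma_null, bias_unchanged))
```

The naive ratio $b/\sigma$ falls purely because $\sigma$ increased, whereas $b_{\rm rel}$ only changes if the bias itself changes.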

The left column of Fig. 4 illustrates the change in errors due
to nulling with the referencing used hitherto, i.e. variant (A).
While the marginalized statistical errors increase by up to a factor
of about three for the weakly constrained dark energy parameters,
the bias drops from values of up to $17\sigma$ to numbers that are
of the same order of magnitude as the original statistical errors,
i.e. $b_{\rm rel} \approx 1$. For parameters that were strongly biased this leads
to a considerable decrease in the mean square error, but $\sigma_{\rm tot}$ may
also slightly increase if the systematic was subdominant already
before nulling, as is the case for the Hubble parameter.



In the right column of Fig. 4 resulting errors for all three
nulling variants are given. It is evident that the newly introduced
versions (B) and (C) of nulling perform significantly better in removing
the systematic. Variant (B) decreases the bias by at least
a factor of three with respect to (A), reversing the sign of the
bias for almost all parameters. This hints at using the reference
redshifts of the nulling weights as free parameters to control the
amount of bias allowed in the data, as will be further discussed
in Sect. 8. Variant (C) nearly perfectly eliminates the GI contamination.
Although the underlying data lacks photometric redshift
errors, knowledge about the distributions $p^{(i)}(z)$ is still advantageous
as e.g. the lowest and highest redshift bins are broad and
largely asymmetric. Regarding statistical errors, the better a version
is capable of removing the systematic, the less stringent parameter
constraints become. However, the improved bias reduction
clearly outweighs the marginal increase in statistical errors.



In summary, we propose to henceforth use nulling with referencing
to the centers of photometric redshift bin divisions, i.e.
variant (B), in the absence of detailed information about redshift
distributions, and otherwise variant (C), which exploits this knowledge.
Both approaches will be considered in the following analyses.

Fig. 5. Cosmology dependence of the nulling weights. The
change in estimates for the cosmological parameters, entering
the distance-redshift relation non-trivially, is plotted for different
iteration steps. The estimates resulting from using variant (C)
are shown as solid lines, those for variant (B) as dashed lines.
Iteration 0 corresponds to the initial values for the parameters,
in this case the results of the analysis of the unmodified data set.
For reference, the estimates obtained by using the true underlying
cosmology to compute the nulling weights are plotted as
thin lines. The hatched regions around these lines signify the $1\sigma$
error region. Note that variant (B) reaches an accuracy comparable
to that obtained with the true cosmology already after one iteration while
variant (C) takes two iterations.



4.2. Cosmology dependence of the nulling weights

The nulling weights $T^{(i)}_{[0]}$ depend on those parameters of the
cosmological model that enter the comoving distance in a non-trivial
way, i.e. for our model assumptions $\Omega_{\rm m}$, $w_0$, and $w_a$. Since
only ratios of comoving distances enter the nulling weights,
there is no dependence on $h_{100}$ which enters the prefactor of (3).
If the relevant cosmological parameters chosen to compute the
nulling weights are different from the true parameters of the data
set, the performance of nulling may deteriorate. A grossly incorrect
choice of nulling weights could in principle affect the lensing
signal more than the GI term, which could then even cause
a larger bias on parameters in the transformed data than in the
original one.
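
The cancellation of $h_{100}$ in the distance ratios is easily verified numerically. The sketch below is restricted to a flat $\Lambda$CDM universe ($w_0 = -1$, $w_a = 0$) for brevity, integrates the comoving distance with the trapezoidal rule, and evaluates a weight of the form $1 - \chi(z_i)/\chi(z_j)$; the redshifts and function names are chosen for illustration and are not taken from the analysis.

```python
import numpy as np

C_KM_S = 299792.458   # speed of light in km/s


def comoving_distance(z, omega_m, h100, n_steps=2000):
    """chi(z) in Mpc for a flat LambdaCDM universe (trapezoidal rule)."""
    zz = np.linspace(0.0, z, n_steps + 1)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + 1.0 - omega_m)
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))
    return C_KM_S / (100.0 * h100) * integral


def zeroth_order_weight(z_i, z_j, omega_m, h100):
    """Weight 1 - chi(z_i)/chi(z_j) for foreground z_i and background z_j."""
    chi_i = comoving_distance(z_i, omega_m, h100)
    chi_j = comoving_distance(z_j, omega_m, h100)
    return 1.0 - chi_i / chi_j


w_a = zeroth_order_weight(0.5, 1.2, omega_m=0.25, h100=0.7)
w_b = zeroth_order_weight(0.5, 1.2, omega_m=0.25, h100=0.5)   # different h100
w_c = zeroth_order_weight(0.5, 1.2, omega_m=0.35, h100=0.7)   # different Omega_m
print(w_a, w_b, w_c)
```

Changing $h_{100}$ rescales both distances by the same factor and leaves the weight untouched, whereas a change in $\Omega_{\rm m}$ alters the shape of $\chi(z)$ and hence the weight.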



Avoiding any a priori guesses of the true values of the rel- 
evant cosmological parameters, we explore the cosmology de- 
pendence of the nulling weights by taking the estimates from 
the analysis of the original data set as input cosmology for the 
computation of the 7® .. As we use the linear alignment model 

( l3~8l l. the estimates pb - Pf + b, where pf is the true parameter 
value and b is the bias, are far from the true values and beyond 
any decent a priori guess, so that this setup can be understood as 
a worst-case scenario. With the weights obtained this way, the 
nulled data can be analyzed, yielding another set of parameter 
estimates. This can then be taken as input for a refined set of 
nulling weights, thereby creating an iterative process which can 
be terminated when successive iterations yield stable parameter 
estimates. 
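
The iteration described above can be sketched as a simple fixed-point loop. This is a minimal illustration only, not the analysis chain used in this work: `toy_analysis` is a hypothetical stand-in for the full sequence "compute nulling weights from an input cosmology, null the data, fit parameters", and both the assumed 10 % response of the estimates to the input cosmology and the parameter values are arbitrary choices.

```python
import numpy as np

# Hypothetical "true" parameters (Omega_m, w_0, w_a); the values are arbitrary.
P_TRUE = np.array([0.3, -1.0, 0.0])


def toy_analysis(p_input):
    """Stand-in for: build nulling weights from p_input, null, fit parameters.

    Assumption: the residual bias is 10 % of the offset of the input
    cosmology, mimicking a weak cosmology dependence of the weights.
    """
    return P_TRUE + 0.1 * (np.asarray(p_input, dtype=float) - P_TRUE)


def iterate_nulling(analysis, p_start, tol=1e-4, max_iter=20):
    """Feed each set of estimates back in as input cosmology until stable."""
    history = [np.asarray(p_start, dtype=float)]
    for _ in range(max_iter):
        history.append(analysis(history[-1]))
        if np.max(np.abs(history[-1] - history[-2])) < tol:
            break
    return history


# Iteration 0: strongly biased estimates from the unmodified data set.
history = iterate_nulling(toy_analysis, P_TRUE + np.array([0.2, 0.5, -1.0]))
for step, p in enumerate(history):
    print(step, np.round(p, 5))
```

In this toy setup the loop stops once successive estimates agree within `tol`; how quickly that happens in practice depends on how strongly the residual bias responds to the input cosmology.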

In Fig. 5 the results of this iteration process are shown for
nulling variants (B) and (C), both showing very similar behavior.
The parameter estimates for iteration 0 correspond to the
estimates of the analysis of the original data set. Given these 
largely incorrect input parameters, nulling is still able to reduce 
the bias due to intrinsic alignment to a level close to the one 
when using the true cosmology as input. Already after the first 
iteration step the residual bias is considerably smaller than the 
statistical errors. After at most two iterations, the results for the 
residual bias are indistinguishable from those with the correct 
input parameters. 

Hence, the dependence of the nulling weights on cosmol- 
ogy is only weak, being solely due to geometrical terms. 
Consequently, nulling is robust against an incorrect initial guess 
for cosmological parameters needed to compute the nulling 
weights. For a consistency check, the iterative procedure out- 
lined above can be performed on the data. In the remainder of 
this work we will use the true cosmology to calculate the nulling 
weights for reasons of simplicity. 



5. Influence of redshift information on nulling 

5.1. Redshift binning 

First, we investigate the performance of nulling as a function
of the number of photometric redshift bins $N_z$ the survey is
divided into. The larger $N_z$, the better (16) is an approximation
of (13), so that the GI removal is expected to work more
efficiently. Furthermore, since nulling eliminates the contribution
to the lensing signal of the background objects only at a single
redshift, more concentrated redshift probability distributions are
nulled more accurately, given an appropriately chosen redshift
$z_i$ within the initial bin. At the same time, less statistical
information is lost because the entries of the transformed data vector,
which are removed in the process of nulling, contain less
independent information if the redshift distributions have a smaller
spacing.

Fig. 6. Ratios $r_\mathcal{F}$ and $r_b$ as a function of the number of
photometric redshift bins $N_z$. Thin curves represent $r_\mathcal{F}$, thick
curves $r_b$. Results for zero photometric redshift error are given as solid
black lines; results for $\sigma_{\rm ph} = 0.05$ are plotted as dashed lines.
For the case $\sigma_{\rm ph} = 0.05$, $r_b$ is also plotted without the gp-term
included in the calculation of the systematic, see the dot-dashed
line. Since only the systematic signal is manipulated, the statistical
signal in this case is still given by the dashed line. Dotted
lines represent $r_\mathcal{F}$ and $r_b$ if correlations of adjacent bins, i.e. bin
combinations $(ij)$ with $j = i + 1$, are excluded. Incorporating the
downweighting scheme for correlations of adjacent bins introduced
in Sect. 5.3 produces the gray solid curves. The two latter
sets of curves were also obtained for $\sigma_{\rm ph} = 0.05$. Note that the
black solid and the dot-dashed lines are very close to zero for
$N_z > 10$ and $N_z > 20$, respectively.

In search for a single quantity that measures the overall power
of a data set to constrain cosmological parameters, we define the
average statistical power as

$$\mathcal{F} = \left\{ \det\left( F_{\mu\nu} \right) \right\}^{\frac{1}{2 N_p}} , \tag{43}$$

where $N_p$ is the number of parameters considered, i.e. the
dimension of the Fisher matrix $F_{\mu\nu}$. This measure is motivated by the
fact that the square root of the determinant of the Fisher matrix is
inversely proportional to the volume of the $N_p$-dimensional error
ellipsoid in parameter space. If errors are not correlated, $\mathcal{F}^2$
reduces to the geometric mean of the inverse square errors. In addition,
we introduce an average relative bias



$$b = \sqrt{ \frac{1}{N_p} \sum_{\mu=1}^{N_p} \left\{ \frac{b(p_\mu)}{\sigma(p_\mu)} \right\}^2 } , \tag{44}$$



which is the root mean square of the ratio of the systematic over 
the statistical error before nulling over all considered parameters. 
We refer to the performance of nulling via the ratios 



$$r_\mathcal{F} = \frac{\mathcal{F}_{\rm null}}{\mathcal{F}_{\rm orig}} ; \qquad r_b = \frac{b_{\rm null}}{b_{\rm orig}} \tag{45}$$



of $\mathcal{F}$ and $b$ after ('null') and before ('orig') nulling, respectively.
For a good performance of nulling, $r_\mathcal{F}$ should tend to one, i.e.
the nulled data constrain parameters as well as the original data,
whereas $r_b$ should tend to zero, which corresponds to a complete
elimination of the systematic.
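
The two summary statistics and their ratios can be illustrated with a few lines of code. The sketch below is not the computation performed for the figures; it assumes that the average statistical power is $\{\det F\}^{1/(2N_p)}$, that the statistical errors are the marginalized errors from the inverse Fisher matrix, and that both biases are normalized by the errors before nulling. All function names and numbers are made up for illustration.

```python
import numpy as np


def average_statistical_power(fisher):
    """det(F)^(1/(2 N_p)), computed via the log-determinant for stability."""
    fisher = np.asarray(fisher, dtype=float)
    sign, logdet = np.linalg.slogdet(fisher)
    if sign <= 0:
        raise ValueError("determinant of the Fisher matrix must be positive")
    return np.exp(logdet / (2.0 * fisher.shape[0]))


def marginalized_errors(fisher):
    """Marginalized 1-sigma errors: sqrt of the diagonal of the inverse."""
    return np.sqrt(np.diag(np.linalg.inv(np.asarray(fisher, dtype=float))))


def average_relative_bias(bias, sigma):
    """Root mean square of bias / statistical error over all parameters."""
    ratio = np.asarray(bias, dtype=float) / np.asarray(sigma, dtype=float)
    return np.sqrt(np.mean(ratio ** 2))


# Toy two-parameter example with uncorrelated errors (made-up numbers).
fisher_orig = np.diag([100.0, 400.0])   # errors 0.1 and 0.05
fisher_null = np.diag([25.0, 100.0])    # errors doubled by nulling
bias_orig = np.array([0.2, 0.1])
bias_null = np.array([0.02, 0.01])

sigma_orig = marginalized_errors(fisher_orig)
r_F = (average_statistical_power(fisher_null)
       / average_statistical_power(fisher_orig))
r_b = (average_relative_bias(bias_null, sigma_orig)
       / average_relative_bias(bias_orig, sigma_orig))
print(r_F, r_b)
```

For this diagonal toy case the squared average statistical power equals the geometric mean of the inverse square errors, as stated in the text.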

Figure 6 shows results for the ratios $r_\mathcal{F}$ and $r_b$ for different $N_z$,
both without photometric redshift errors and for $\sigma_{\rm ph} = 0.05$. In
this section the linear alignment model is used as the systematic,
downscaled by a factor of five. For five redshift bins $\mathcal{F}_{\rm null}$ is only
about a third of $\mathcal{F}_{\rm orig}$, but $r_\mathcal{F}$ rises, first strongly and then with
an increasingly shallow slope for larger $N_z$. This development
is mostly due to the improving performance of nulling, since
for a cosmic shear tomography data set statistical errors decrease only
marginally for $N_z > 5$ (see e.g. Hu 1999; Simon et al. 2004;
Ma et al. 2006; Bridle & King 2007; JS08).



Introducing a photometric redshift dispersion of $\sigma_{\rm ph} = 0.05$,
one finds that, for small $N_z$, $r_\mathcal{F}$ increases in the same way as
in the case without photometric redshift errors. As soon as the
size of the redshift bins attains the same order as the width of
the dispersion $\sigma_{\rm ph} (1 + z)$, less additional redshift information
becomes available to constrain parameters. Since nulling, like
other techniques that deal with the control of intrinsic alignments
(e.g. Bridle & King 2007), requires more precise redshift
information, the curve for $r_\mathcal{F}$ levels off.

Even for only five bins in redshift, nulling is capable of reducing
the average bias $b$ by more than 95 % for perfect redshift
information. For $N_z > 10$, less than 1 % of the average
bias remains. If a more realistic photometric redshift dispersion
is present in the data, $r_b$ significantly degrades to approximately
0.15 for $N_z = 5$. For ten photometric redshift bins a minimum
value of $r_b \approx 3.5$ % is achieved before this ratio increases again
for more bins, meaning that the treatment of the systematic worsens
in spite of the improvement of redshift information due to
the finer division of photometric redshifts. This apparent
contradiction requires a more thorough investigation and will be
addressed in Sect. 5.3.

5.2. Minimum information loss 

Given ideal spectroscopic redshift information, equivalent to
considering the limit $N_z \rightarrow \infty$, it would be possible to precisely
eliminate the GI contamination at a given redshift, see (11), so
that $r_b$ tends to zero in the absence of photometric redshift errors, as
is indeed the case. However, the curves for $r_\mathcal{F}$ in Fig. 6 apparently
indicate that the full statistical information is not regained
in this limit, i.e. $r_\mathcal{F}$ does not tend to unity. We investigate this
further by calculating $r_\mathcal{F}$ out to larger $N_z$, assuming a simplified
model with infinitesimally narrow redshift bins,

$$p^{(i)}(z) = \delta_{\rm D}(z - z_i) , \tag{46}$$

and a covariance that contains only shot noise. The resulting
curve, shown in Fig. 7, increases more slowly than logarithmically as
a function of $N_z$, so that one can expect that nulling indeed
inevitably reduces the statistical power of a data set, even if
spectroscopic redshifts were available.

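To illustrate this effect, consider again the continuous, integral
version of (18), still in the limit of perfect redshift information.
Choosing the zeroth-order nulling weight proportional to
$1 - \chi_i/\chi_j$, one can write the corresponding transformed
power spectrum as

$$\Pi_{[0]}(\ell, \chi_i) = \int_{\chi_i}^{\chi_{\rm hor}} {\rm d}\chi_j \left( 1 - \frac{\chi_i}{\chi_j} \right) P_{\rm GG}(\ell; \chi_i, \chi_j)$$
$$= \frac{9 H_0^4 \Omega_{\rm m}^2}{4 c^4} \int_{\chi_i}^{\chi_{\rm hor}} {\rm d}\chi_j \left( 1 - \frac{\chi_i}{\chi_j} \right) \int_0^{\chi_i} {\rm d}\chi \left( 1 - \frac{\chi}{\chi_i} \right) \left( 1 - \frac{\chi}{\chi_j} \right) \left\{ 1 + z(\chi) \right\}^2 P_{\delta\delta}\left( \frac{\ell}{\chi}, \chi \right) , \tag{47}$$

where in order to arrive at the second equality, the lensing power
spectrum for spectroscopic redshifts has been obtained by inserting
(46) into the general expression for the lensing power spectrum. Note
that the upper limit in the integration over $\chi$ changes from
$\chi_{\rm hor}$ to $\chi_i$ because the lensing efficiency, here
written as $1 - \chi/\chi_i$, vanishes for $\chi > \chi_i$. Rearranging the terms,
one arrives at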

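$$\Pi_{[0]}(\ell, \chi_i) \propto \int_0^{\chi_i} {\rm d}\chi \left( 1 - \frac{\chi}{\chi_i} \right) g(\chi) \left\{ 1 + z(\chi) \right\}^2 P_{\delta\delta}\left( \frac{\ell}{\chi}, \chi \right) \tag{48}$$

$$\mbox{with} \quad g(\chi) = \int_{\chi_i}^{\chi_{\rm hor}} {\rm d}\chi_j \left( 1 - \frac{\chi_i}{\chi_j} \right) \left( 1 - \frac{\chi}{\chi_j} \right) .$$

Fig. 7. Ratio $r_\mathcal{F}$ as a function of the number of photometric redshift
bins $N_z$. This result has been obtained by means of a simplified
Fisher matrix calculation, placing galaxies at fixed redshifts
and neglecting cosmic variance in the covariance. For large $N_z$
the increase in $r_\mathcal{F}$ is slower than logarithmic.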
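
As a small numerical aside, the function $g(\chi)$ entering (48) can be evaluated in closed form, because its integrand is a polynomial in $1/\chi_j$. The following sketch assumes the form $g(\chi) = \int_{\chi_i}^{\chi_{\rm hor}} {\rm d}\chi_j\, (1 - \chi_i/\chi_j)(1 - \chi/\chi_j)$, uses arbitrary distance units and hypothetical function names, and simply checks the closed form against a trapezoidal quadrature.

```python
import numpy as np


def g_closed_form(chi, chi_i, chi_hor):
    """Antiderivative x - (chi_i + chi) ln x - chi_i chi / x, evaluated
    between the limits chi_i and chi_hor."""
    return ((chi_hor - chi_i)
            - (chi_i + chi) * np.log(chi_hor / chi_i)
            + chi_i * chi * (1.0 / chi_i - 1.0 / chi_hor))


def g_numerical(chi, chi_i, chi_hor, n=20001):
    """Trapezoidal quadrature of (1 - chi_i/x)(1 - chi/x) over x."""
    x = np.linspace(chi_i, chi_hor, n)
    f = (1.0 - chi_i / x) * (1.0 - chi / x)
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x))


# Arbitrary units: foreground bin at chi_i = 1, horizon at chi_hor = 4,
# matter at chi = 0.5 in front of the foreground bin.
chi_i, chi_hor, chi = 1.0, 4.0, 0.5
g_exact = g_closed_form(chi, chi_i, chi_hor)
g_quad = g_numerical(chi, chi_i, chi_hor)
print(g_exact, g_quad)
```

The weight $1 - \chi_i/\chi_j$ inside the integral grows monotonically from zero at $\chi_i$ out to $\chi_{\rm hor}$, so the effective background distribution covers the whole range behind the foreground bin.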



Comparing (48) to the general expression for the lensing power spectrum,
one finds that the term $g(\chi)$ is formally equivalent to the lensing
efficiency of the background distribution$^2$, the term $1 - \chi_i/\chi_j$
acting analogously to a distance probability distribution of galaxies.
Thus, this 'background distribution' of the transformed power spectrum is
broad, extending from the position of the foreground bin at $\chi_i$ to the
maximum distance $\chi_{\rm hor}$. Since the zeroth-order nulled power spectra
are removed from the data set, it is this integrated redshift information
for all foreground bin positions $\chi_i$ that is necessarily lost due to
nulling.

$^2$ For perfect correspondence the lower limit of the integral over $\chi_j$
should be $\chi$ instead of $\chi_i$. However, the nulling weight given as
$1 - \chi_i/\chi_j$ has to vanish for $\chi_j < \chi_i$, and at the same time
the outer integral ensures $\chi < \chi_i$.

5.3. Intrinsic alignment contamination from adjacent bins

The increase in $r_b$ for large $N_z$ in the case $\sigma_{\rm ph} = 0.05$, as seen in
Fig. 6, can be explained by inspecting the expression for the projected GI
power spectrum. To produce a GI effect,
the intrinsic alignment has to act on the foreground galaxy while
the background galaxy is lensed. Hence, the GI signal should
stem from the first term in this expression, whereas the second term,
which contains $g^{(i)}(\chi)\, p^{(j)}(\chi)$ with $i < j$, vanishes if the redshift
probability distributions are disjoint, see (11). We refer to the latter
expression as the gp-term hereafter. This term can yield a
contribution to the systematic in case the distributions overlap such
that the true position of a galaxy from the background population
is in front of galaxies from the foreground distribution. The
contribution to the GI signal by swapped galaxy positions is not
accounted for by nulling and produces a residual systematic.

To quantify the effect caused by the gp-term, we compute
the average bias for the same model of the three-dimensional
GI power spectrum, but now with the gp-term removed from
the projected GI power spectrum. The resulting ratio $r_b$ is plotted in
Fig. 6 as well. While this curve shows a behavior similar to the one for the
systematic with gp-term for $N_z < 10$, it does not follow the turnaround and
continues to decrease for larger $N_z$ down to values of $r_b$ obtained
for data without photometric redshift errors, as expected. Thus,
the increase in $r_b$ of the data with $\sigma_{\rm ph} = 0.05$ for $N_z > 10$ can
indeed be explained by the contamination due to the gp-term.

The gp-term cannot be quantified in detail as it depends explicitly
on the form of the matter-intrinsic shear power spectrum that enters
the projected GI power spectrum. However, it is produced by an overlap of
the redshift distributions of foreground and background galaxies, so that
the gp-term can be controlled by removing or downweighting
bin combinations with a large overlap in redshift, in particular
adjacent photometric redshift bins. For instance, one can simply
exclude power spectra for bins $(ij)$ with $j = i + 1$ from the analysis,
which results in the dotted curves given in Fig. 6. Indeed the
contamination by the gp-term is suppressed, producing merely
a less significant increase in $r_b$ for $N_z > 20$, but the statistical
power decreases dramatically due to the removal of all power
spectra with $j = i + 1$.

To alleviate this effect, we propose to downweight adjacent
redshift bin combinations. According to (20), increasing an entry
in the zeroth-order nulling weight implies a lower value in
the corresponding entries of the higher-order weights. Hence, a
manipulation of the zeroth-order weights can be used to downweight
certain power spectra in the process of nulling. We introduce
the following modified weights



$$T^{(i)}_{[0]\,j} \;\rightarrow\; w_{ij}\, T^{(i)}_{[0]\,j} \quad \mbox{with} \quad w_{ij} = 1 + \exp\left\{ \ldots \right\} . \tag{49}$$



To motivate this choice, consider that for $j \gg i$ one gets $w_{ij} \approx 1$,
so that in the regime where the gp-term is unimportant the original
weights are reproduced. Moreover, $w_{ii} = 2$, which is in agreement
with the fact that the gp-term is equal to the first term in
the projected GI power spectrum for auto-correlations (note however that
auto-correlations are excluded from the analysis anyway). The width of the
Gaussian in (49) is in principle arbitrary, but here conveniently chosen to
scale with the width of the photometric redshift bins.

Therefore, the $w_{ij}$ are expected to follow the redshift dependence
of the gp-term, so that the higher-order nulling weights
with $q \geq 1$ efficiently downweight its contribution. Note
that the modification of the nulling weights is done before
normalization, such that the weight vectors still have unit length. As
an aside, the weighting scheme (49) would also contribute to the
downweighting of contaminations by the II term.
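
The properties of the weighting listed above can be checked with a short sketch. The exact argument of the Gaussian is not reproduced here; the code assumes, purely for illustration, a Gaussian in the difference of bin redshifts with a width equal to the bin spacing. The bin redshifts, the toy zeroth-order weight $1 - z_i/z_j$ (distance taken proportional to redshift), and all function names are hypothetical.

```python
import numpy as np


def adjacent_bin_weights(z_bins, width):
    """Assumed form: w_ij = 1 + exp(-(z_j - z_i)^2 / (2 width^2))."""
    z = np.asarray(z_bins, dtype=float)
    dz = z[None, :] - z[:, None]
    return 1.0 + np.exp(-0.5 * (dz / width) ** 2)


def reweight_zeroth_order(t0, w_row):
    """Multiply a zeroth-order weight vector by w_ij, then renormalize
    so that the modified vector has unit length."""
    t = np.asarray(t0, dtype=float) * np.asarray(w_row, dtype=float)
    return t / np.linalg.norm(t)


z_bins = np.linspace(0.15, 2.85, 10)    # hypothetical bin redshifts
width = z_bins[1] - z_bins[0]           # width tied to the bin spacing
w = adjacent_bin_weights(z_bins, width)

# Initial bin i = 0 with background bins j = 1..9.
t0 = 1.0 - z_bins[0] / z_bins[1:]
t0 = t0 / np.linalg.norm(t0)
t_mod = reweight_zeroth_order(t0, w[0, 1:])
print(np.round(w[0], 3))
```

In this sketch the adjacent-bin entry of the zeroth-order weight is raised relative to distant bins, which is the manipulation the text relies on to lower the corresponding entries of the higher-order weights.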

Applying this Gaussian weighting scheme to the nulling procedure,
one obtains the gray curves of Fig. 6. While for a small
number of redshift bins $r_\mathcal{F}$ is similar to the case where all power
spectra except auto-correlations were used, the curve approaches
the results for the case with power spectra of adjacent bins
removed for large $N_z$. This means that for small $N_z$ the overlap
between redshift bins is marginal, so that the weighting has only
little effect, whereas for many bins power spectra with $j = i + 1$ are
largely downweighted, such that removing them produces similar
results. The Gaussian weighting ensures that $r_b < 5$ % for
all $N_z > 10$. We will further consider the performance of this
weighting scheme in Sect. 7.1.

The best binning in photometric redshifts in terms of nulling 
performance does not only depend on the number of bins N z , but 
to a certain extent also on the choice of bin boundaries. The op- 
timal positions of bin boundaries are determined by the detailed 
form of the relation between photometric and true, spectroscopic 
redshifts, which is specific to each survey and thus shall not be 
further assessed here.

Fig. 8. Top panel: Ratios $r_\mathcal{F}$ and $r_b$ as a function of photometric
redshift dispersion $\sigma_{\rm ph}$. The nulling has been performed
using variant (B), and the linear alignment model, downscaled by
a factor of five, has been employed as systematic. Solid black
curves correspond to $r_\mathcal{F}$, while $r_b$ for the linear alignment model
as systematic is given as black dashed curve. The values of $r_b$
for the same model, but with the gp-term removed from the
GI power spectrum calculation, are given as dot-dashed line. The
gray curves show $r_b$ for the GI power-law models, where the
different gray-scales stand for different slopes $s$ as given in the
legend. Bottom panel: Same as above, but using nulling variant
(C).



6. Influence of photometric redshift uncertainty 

6.1. Photometric redshift errors 

This section deals with the dependence of nulling on the photometric
redshift dispersion $\sigma_{\rm ph}$, in the absence of catastrophic
outliers. The number of photometric redshift bins is kept at $N_z = 10$
for the remainder of this work, mainly for computational reasons.
Future cosmic shear surveys, relying on precise redshift
information and a large number of galaxy detections, will allow
for considerably more photometric redshift bins, which may be
advantageous in terms of nulling, see the foregoing section.

In Fig. 8 $r_\mathcal{F}$ is plotted as a function of $\sigma_{\rm ph}$, while in Fig. 9,
upper panel, the ratios of the marginalized statistical errors before
and after nulling are given for the parameters $\Omega_{\rm m}$ and $\sigma_8$
individually. The curves for the other cosmological parameters vary
considerably in magnitude, but otherwise show the same
characteristics as the ones depicted. The ratio $r_\mathcal{F}$ decreases only very
weakly with increasing $\sigma_{\rm ph}$ for both nulling variants (B) and (C),
taking values between 0.44 and 0.48, because splitting the range
of redshifts between 0 and 3 into 10 photometric redshift bins
does not lead to a significant degrading of redshift information,
even for $\sigma_{\rm ph} = 0.1$. In contrast to this, the ratio of the marginalized
errors of individual cosmological parameters does vary with
$\sigma_{\rm ph}$, but changes are smaller than about 10 %. The statistical errors
of both the original and the nulled data set increase similarly for larger
photometric redshift errors, but the error of the nulled
set starts to do so already at smaller $\sigma_{\rm ph}$, thereby producing a
peak at $\sigma_{\rm ph} \approx 0.03$ in both curves in Fig. 9. Marginalized errors
for each of the seven considered parameters are a factor of
roughly two to three larger for the nulled data.

Fig. 9. Performance of nulling as a function of photometric redshift
dispersion $\sigma_{\rm ph}$. The nulling has been done using variant (C),
and the linear alignment model, downscaled by a factor of five,
has been employed as systematic. Shown are the results for the
parameters $\Omega_{\rm m}$ as black curves, and for $\sigma_8$ as gray curves. Top
panel: Ratio of the marginalized statistical errors after and before
nulling. Bottom panel: Relative bias $b_{\rm rel}$. Dotted curves
correspond to $b_{\rm rel}$ before nulling; dashed curves to $b_{\rm rel}$ after nulling.
The solid line marks values of $b_{\rm rel}$ for which the marginalized
statistical errors equal the bias. Note the logarithmic scaling of
the ordinate axis.

As is evident from Fig. 8, lower panel, nulling using variant
(C) is capable of reducing the average bias caused by the linear
alignment model by more than a factor of 50 for $\sigma_{\rm ph} < 0.04$.
Looking at the effect on the bias of individual parameters in
Fig. 9, lower panel, one sees that the systematic is suppressed
by more than two orders of magnitude for small $\sigma_{\rm ph}$. In spite of the
strong intrinsic alignment signal, the bias is kept subdominant
up to $\sigma_{\rm ph} \approx 0.05$. The drop in $r_b$ at $\sigma_{\rm ph} \approx 0.03$ is also visible
in Fig. 9 and can be traced back to a sign change in the residual
bias for several parameters, among them $\Omega_{\rm m}$ and $\sigma_8$.

For larger redshift dispersions, $r_b$ shows an approximately
linear increase, which can only partially be ascribed to the
contamination by the gp-term, as can be concluded from comparing
with the curve for the linear alignment model without gp-term.
The rise in $r_b$ is caused by two effects that are visible in Fig. 9.
First, the strong relative bias in $\Omega_{\rm m}$ and $\sigma_8$ for the original data
set starts to slowly decrease for $\sigma_{\rm ph} \gtrsim 0.02$, predominantly
because the statistical errors rise due to the degrading information
content in the line-of-sight direction. Second, the residual bias
after nulling increases as a function of $\sigma_{\rm ph}$ and starts to attain
values of the same order as the statistical errors, i.e. $|b_{\rm rel}| \sim 1$, at
just about $\sigma_{\rm ph} \approx 0.05$. The part of this degradation that cannot
be traced back to the effect of the gp-term has to stem from the
incorrect assessment of the redshift dependence of the GI signal,
either due to the approximations inherent to the derivation of
nulling or the suboptimal placement of the redshift at which the
signal is nulled.

Figure 8 also shows $r_b$ for the power-law GI model with
varying slopes. The behavior of $r_b$ as a function of $\sigma_{\rm ph}$ is in very
good agreement with the results for the linear alignment model,
$r_b$ reaching about 0.03 for $\sigma_{\rm ph} < 0.04$, and up to 30 % higher
values for $\sigma_{\rm ph} = 0.1$ in comparison with the linear alignment
model. This suggests that at least the orders of magnitude of our
results, as well as the general conclusions drawn from a particular
GI model used in this work, can be taken to robustly estimate
the effects of a realistic GI contamination.

Moreover, Fig. 8, upper panel, illustrates the performance of
nulling using variant (B), i.e. without using information about
the form of the redshift probability distributions, and placing the
redshift at which the signal is nulled at the center $z_c$ of the
respective photometric redshift bin. This version of nulling is
capable of retaining marginally more information in the data, in
particular for small $\sigma_{\rm ph}$. For high-quality redshift information the
reduction in bias is worse, $r_b$ approximately doubling compared
to variant (C). Again at $\sigma_{\rm ph} \approx 0.04$, $r_b$ starts to increase, but more
steeply, so that for $\sigma_{\rm ph} > 0.04$ nulling quickly becomes rather
inefficient. As for variant (C), the curves for $r_b$ of the different GI
models agree well in their functional form, but yield largely
different amplitudes. It is striking that the curve calculated without
the gp-term does not feature a distinct increase for large $\sigma_{\rm ph}$.
This suggests that variant (B), when combined with the weighting
scheme of Sect. 5.3, could perform well also for larger
photometric redshift errors, as we will investigate in Sect. 7.1.




Fig. 10. Least squares sum $R^2$ as a function of nulling redshift
$z_{\rm null}$. The results for photometric redshift bins one to eight
correspond to the suite of gray-scale curves as given in the legend.
Thin dashed lines represent the results for $R^2$ obtained when
calculating the power spectrum without gp-term. Since we used
$\sigma_{\rm ph} = 0.05$ to produce this data, the minima of the latter curves
are slightly offset. The local minima of these curves correspond
to the optimal nulling redshifts $z_{\rm null}$ plotted in Fig. 12. Note that
$R^2$ at the local minima is close to, but always larger than zero.

6.2. Analyzing optimal nulling redshifts

The construction of nulling weights allows for a certain freedom
in the choice of redshifts which the photometric redshift bins
are assigned to. We wish to investigate which choice of redshifts
$z_i$, i.e. those redshifts where the signal is nulled, is optimal in the
sense that the resulting zeroth-order nulling weights (18) best
reproduce the redshift dependence of the GI signal, and thus
effectively remove the systematic. The procedure to find such optimal
nulling redshifts, denoted by $z_{\rm null}$, is outlined in the following.
We emphasize that the calculation of $z_{\rm null}$ merely constitutes a
diagnostic tool, inapplicable to data, since the GI systematic has
to be known exactly to do this.

Judging from (11) and the considerations in Sect. 4.1, using
the lensing efficiency $g^{(j)}(\chi(z_i))$ as zeroth-order nulling weight
is most effective in case of precise redshift information. In fact,
in the limit of spectroscopic redshifts $g^{(j)}(\chi(z_i))$ matches the
redshift dependence of the GI signal perfectly. In the approximation
of infinitesimally narrow redshift probability distributions for the
photometric redshift bins with lower median redshift, i.e. the
initial bins, the redshifts $z_i$ would mark the position at which the
GI signal would be perfectly removed. In reality, the photometric
redshift bins $i$ have finite size, as do the corresponding
distributions of true redshifts $p^{(i)}(z)$. The nulling redshift $z_i$ is not fully
specified anymore and has to be chosen appropriately. One
reasonable choice is the median redshift of bin $i$, which corresponds
to nulling variant (C). In this section we treat the $z_i$ as free
parameters and determine an optimal value $z_{\rm null}$.

Hence, we aim at determining $z_i$ such that $g^{(j)}(\chi(z_i))$ fits
$P^{(ij)}_{\rm GI}(\ell)$ best, since then nulling with $g^{(j)}(\chi(z_i))$ as
zeroth-order weight completely removes the intrinsic alignment signal. To
this end, we compute the best-fitting lensing efficiency, using the
least squares sum over all background bins $j$,

$$R^2(A_P, z_i) = \sum_{j=i+1}^{N_z} \left( A_P\, P^{(ij)}_{\rm GI}(\ell) - g^{(j)}(\chi(z_i)) \right)^2 , \tag{50}$$

where the initial bin $i$ and the angular frequency $\ell$ are fixed.
As default, we employ the values of $P^{(ij)}_{\rm GI}(\ell)$ for the central
angular frequency bin, i.e. the bin with index $N_\ell/2$, which corresponds
to $\ell \approx 414$. We warn that this is a crude approximation as
the three-dimensional intrinsic alignment power spectrum varies
significantly over the range of the integral in the projected GI power
spectrum. The redshift-independent part of the dependence of the GI power
spectrum on $\ell$ can be absorbed into the free scaling $A_P$. The remaining
$\ell$-dependence is accounted for by determining $z_{\rm null}$ for different
angular frequencies, see Fig. 12 below.

Since differences in the amplitude of $P^{(ij)}_{\rm GI}(\ell)$ and $g^{(j)}(\chi(z_i))$
are not of interest, the dependence of $R^2$ on the scaling is
eliminated by calculating the extremal $A_P$ from the condition
$\partial R^2 / \partial A_P = 0$, yielding

$$A_P = \frac{\sum_j P^{(ij)}_{\rm GI}(\ell)\; g^{(j)}(\chi(z_i))}{\sum_j \left\{ P^{(ij)}_{\rm GI}(\ell) \right\}^2} . \tag{51}$$

Fig. 11. Determination of the optimal nulling redshift. Top panel:
Results for $\sigma_{\rm ph} = 0$. The filled squares display the redshift
dependence of the GI power spectrum, i.e. $A_P\, P^{(ij)}_{\rm GI}(\ell)$ is plotted
for different background bins $j$ and fixed $i$ and $\ell$. The lines
correspond to the lensing efficiencies $g^{(j)}(\chi(z_i))$ for the best-fitting $z_i$,
respectively. The values for bin $j$ of both lensing efficiencies and
power spectra have been assigned to the median redshift of this
bin, linearly interpolating in between for $g^{(j)}(\chi(z_i))$. The numbers
alongside the curves mark the initial bin number $i$. Bottom
panel: Same as above, but for $\sigma_{\rm ph} = 0.1$. Here we plot in addition
the results obtained by excluding the gp-term from the
calculation of the GI signal as dashed curves and open squares,
respectively.



Now $R^2$ is computed for a wide range of $z_i$, making use of the
fact that (51) reduces the problem to a one-dimensional minimization.
The value of $z_i$ that corresponds to the minimum least
squares is then set as the optimal nulling redshift $z_{\rm null}$.
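
The one-dimensional search can be sketched as follows. This is a toy illustration under stated assumptions, not the computation behind the figures: comoving distance is taken proportional to redshift, the lensing efficiency is approximated by $1 - z_i/z_j$ for delta-like background bins, and the mock GI signal is generated from that same efficiency at a known redshift. All names and numbers are hypothetical. The search picks a strict local minimum of $R^2$, because $R^2$ also vanishes trivially once the lensing efficiency itself tends to zero at high redshift.

```python
import numpy as np


def lensing_efficiency(z_null, z_background):
    """Toy weight 1 - chi_i/chi_j with chi proportional to z, clipped at 0."""
    return np.clip(1.0 - z_null / z_background, 0.0, None)


def least_squares_curve(p_gi, z_background, z_grid):
    """R^2 on a grid of trial redshifts, with the scaling A_P eliminated
    analytically through the extremum condition dR^2/dA_P = 0."""
    r2 = np.empty_like(z_grid)
    for k, z in enumerate(z_grid):
        g = lensing_efficiency(z, z_background)
        a_p = np.dot(p_gi, g) / np.dot(p_gi, p_gi)
        r2[k] = np.sum((a_p * p_gi - g) ** 2)
    return r2


def optimal_nulling_redshift(p_gi, z_background, z_grid):
    """Return the strict local minimum of R^2 with the smallest value."""
    r2 = least_squares_curve(p_gi, z_background, z_grid)
    inner = np.arange(1, len(z_grid) - 1)
    is_min = (r2[inner] < r2[inner - 1]) & (r2[inner] < r2[inner + 1])
    candidates = inner[is_min]
    if candidates.size == 0:
        raise ValueError("no local minimum found on the grid")
    best = candidates[np.argmin(r2[candidates])]
    return z_grid[best], r2[best]


# Mock data: five background bins and a GI signal that follows the toy
# efficiency at z = 0.5 with an arbitrary amplitude of 3.
z_background = np.array([0.8, 1.1, 1.4, 1.7, 2.0])
p_gi = 3.0 * lensing_efficiency(0.5, z_background)
z_grid = np.linspace(0.0, 2.5, 251)

z_null, r2_min = optimal_nulling_redshift(p_gi, z_background, z_grid)
print(z_null, r2_min)
```

Restricting the search to strict local minima excludes the flat region at high redshift where the fit is perfect only because both the scaled signal and the lensing efficiency vanish.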

In Fig. 10 the least squares sum $R^2$ is plotted as a function of
the $z_i$ for a data set with $\sigma_{\rm ph} = 0.05$, using the downscaled linear
alignment model to compute the GI power spectrum. Note that
for high redshifts the lensing efficiency tends to zero, thereby
implying an extremal value of $A_P = 0$. Thus, the least squares go
to zero for high redshifts because a GI power spectrum, scaled
to zero, fits a vanishing lensing efficiency perfectly. The optimal
nulling redshift is therefore extracted from the well-defined local
minima of $R^2$, which can be clearly seen in Fig. 10.

The procedure to compute $z_{\rm null}$ is illustrated by Fig. 11. The
redshift dependence of the GI power spectra for initial bins 1
to 3, and the corresponding best-fit lensing efficiencies are plotted,
referring the values for bin $j$ of both quantities to the median
redshift of distribution $p^{(j)}(z)$. The curves corresponding
to the lensing efficiency are obtained via linear interpolation of
the set of $g^{(j)}(\chi(z_i))$ with $j = i + 1, \ldots, N_z$. For the case without
photometric redshift errors, nulling redshifts can be found such
that the resulting lensing efficiencies almost exactly fit the redshift
dependence of the GI power spectrum, so that in this case
the approximation of infinitesimally narrow initial bins has little
negative influence on the nulling performance.

Fig. 12. Optimal nulling redshift $z_{\rm null}$ as a function of photometric
redshift dispersion $\sigma_{\rm ph}$. Plotted are the results for different GI
signals, including the linear alignment model with and without
gp-term, and the power law model with slopes $s = \{0.1, 0.4, 0.7\}$.
Solid curves correspond to $z_{\rm null}$ for the linear alignment model,
evaluated at the central angular frequency bin. Excluding the gp-term
for this setup results in the dotted line. The gray areas indicate
the range of $z_{\rm null}$ for all intrinsic alignment models considered,
evaluated at the lowest and highest angular frequency bin
each. In addition, the bin boundaries are shown as thick solid
lines, while the median redshifts of the redshift probability
distributions are represented by thick dashed curves.

In the bottom panel of Fig. 11 we plot results for a large redshift
uncertainty of $\sigma_{\rm ph} = 0.1$. Deviations of the redshift dependence
of the GI signal from the best-fitting $g^{(j)}(\chi(z_i))$ are visible
particularly for the lowest bin considered, i.e. for $j = i + 1$, and
the bin at the highest redshift. The latter effect can be ascribed
to the large width and asymmetry of the corresponding redshift
probability distribution, see Fig. 1. The GI power spectrum shifts
to higher values for bins $j = i + 1$ and $\sigma_{\rm ph} \gg 0$ because of the
gp-term, which has the strongest contribution for adjacent photometric
redshift bins. Accordingly, the GI signal is significantly
smaller for bins $j = i + 1$ if calculated without the gp-term, and
a lensing efficiency that fits the GI term much better, i.e. with
smaller $R^2(A_P, z_{\rm null})$, can be found. Since $P^{(ij)}_{\rm GI}(\ell)$ without the
gp-term is generally best fit by lensing efficiencies with higher $z_i$
than the power spectrum with gp-term, $R^2$ attains its minimum
at higher $z_i$, as is also evident from Fig. 10.

We repeat the determination of $z_{\rm null}$ for all relevant initial
bins, for the GI power spectrum at the lowest and highest angular
frequency bin in addition to the central one, and for varying $\sigma_{\rm ph}$;
our findings are depicted in Fig. 12. The gray regions cover
the range of resulting curves for all four considered GI models
(linear alignment; power law with $s = \{0.1, 0.4, 0.7\}$), evaluated
at the lowest, central, and highest angular frequency bin each.
Hence, these regions should mark to good accuracy the possible
range of $z_{\rm null}$ for any GI signal. In addition, curves representing
the photometric redshift bin boundaries, the median redshifts of
the distributions, and $z_{\rm null}$ for the linear alignment model,
computed for the central angular frequency bin with and without the
gp-term, are shown.

In the regime of $\sigma_{\rm ph}$ in which nulling performs excellently,
i.e. $\sigma_{\rm ph} < 0.04$ (Fig. 8), we find that the median redshifts are
very close to the optimal nulling redshifts. Only for the lowest
initial bin is the allowed region of $z_{\rm null}$ broader, but still well fit
by the median redshift. Using the central redshifts $z_c$ as nulling
redshifts proves to be a fair approximation if the underlying redshift
probability distributions are not too asymmetric, as is for
instance the case in our model of redshift distributions except
for the distributions at the lowest and highest median redshift.
These results confirm that variant (C) with nulling at the median
redshifts indeed yields the best performance for a survey with
small redshift dispersion. As can also be concluded from Fig. 12,
variant (B) works only slightly less effectively in this case.

Regarding the behavior of the curves for large cr p h, z nu u 
considerably deviates from its values at small redshift errors, 
partially crossing the original photometric redshift bin bound- 
aries. While the median redshifts at least qualitatively follow the 
change in z nu ii with increasing cr p h by trend, the Zc of nulling 
variant (B) represent the actual z nu n even worse, as the results 
of Fig. [8] verify. The drop of ZnaU for the higher initial bins can 
almost entirely be explained by the gp-teim contribution. Its re- 
moval produces curves that keep close to the median redshifts, 
see Fig. [12] The remaining offsets of z nu u from the median red- 
shifts presumably originate from the variation of the integrand 
in (O across the broad distribution of the initial bins. However, 
since we compute the GI power spectrum only for single $\ell$-bins,
the accuracy in the calculation of $z_{\rm null}$ is limited. This holds true
in particular for broad redshift distributions, as indicated by the
widening of the gray regions, which is dominated by the scatter of
the curves computed for different angular frequency bins.

7. Influence of further characteristics of the redshift 
distribution 

7.1. Catastrophic outliers 

Future cosmic shear data, in particular for space-based surveys
incorporating infrared bands (Abdalla et al. 2007), will be able
to rely on exquisite multi-band photometry, so that the fraction of 
catastrophic failures in the assignment of photometric redshifts 
will be kept at a very low level. A significant fraction of outliers 
in the redshift probability distributions would have a devastating 
effect on the removal of intrinsic alignment. For instance, con- 
sider a photometric redshift bin i at relatively high redshift. If 
it mistakenly contains galaxies whose true redshift is low, these 
would produce a strong GI signal when correlated with another 
high redshift background bin j. 



We compute the ratios $r_f$ and $r_b$, now as functions of both $\sigma_{\rm ph}$
and $f_{\rm cat}$, keeping the offset fixed at $\Delta_z = 1.0$. To judge the effect
of outliers, it is important to note that $f_{\rm cat}$ is not the true fraction
of catastrophics, which is instead given by $r_{\rm out}$ as shown in Fig. 2.
Results for $r_f$ and $r_b$ are given in Fig. 13 for the linear intrinsic
alignment model as the systematic, again downscaled by a factor of
five. The left column shows results for nulling variant (C), the right
column for variant (B), where in the bottom four panels the weighting
scheme (49) has been applied in addition.

Inspecting first the plots obtained without the weighting scheme,
one sees that, as before, $r_f$ varies only little with the parameters
of the photometric redshifts, staying around 45 % for variant (C).
Variant (B) retains slightly more information than (C), i.e. around
50 %, which is in accordance with Figs. 4 and 8. Moreover, the
fraction of catastrophic outliers indeed has a strong effect on the
ability of nulling to remove the GI systematic. Variant (C) performs
well for high-quality redshifts, but $r_b$ increases significantly when
increasing both $\sigma_{\rm ph}$ and $f_{\rm cat}$, reaching $r_b \approx 0.5$ for
$\sigma_{\rm ph} = 0.1$ and $f_{\rm cat} = 0.1$. In contrast, variant (B) proves to be
much more robust against catastrophic outliers, still reducing the
average bias by about a factor of ten for $\sigma_{\rm ph} < 0.05$ and any
outlier fraction considered here. The performance merely degrades for
large $\sigma_{\rm ph}$, but remains below $r_b \approx 0.3$ in the case of the linear
alignment model, see also Fig. 8.

Introducing the weighting scheme for adjacent photometric
redshift bins to the nulling technique modifies its performance
substantially. For $\sigma_{\rm ph} < 0.05$ the changes are small, as expected.
The larger $\sigma_{\rm ph}$, the more adjacent bin combinations are
downweighted, and the larger the decrease in $r_f$. The ratio $r_f$ drops by
up to 0.15 in the case of variant (C). At the same time the region in
which $r_b$ is desirably small extends significantly towards larger
$\sigma_{\rm ph}$. While this improvement is mostly relevant in the regime of
low outlier rates for variant (C), variant (B) achieves $r_b < 0.1$
across the full range of $\sigma_{\rm ph}$ and $f_{\rm cat}$ considered. In other words,
nulling can reduce the GI contamination by at least a factor of
10 for all realistic configurations of redshift errors, given that the
GI systematics we consider should be close to a worst case. The
even stronger biases caused by the power law models (Fig. 8) are
mostly due to the gp-term and can thus also be expected to decrease
when the weighting scheme is applied.

To summarize our findings, we present our different error
measures for three exemplary models in Table 3. The three
sets represent surveys with high (set 1), medium (set 2), and low
(set 3) quality redshift information, with parameters $\sigma_{\rm ph}$ and
$f_{\rm cat}$ as given in the table. According to the results of the
foregoing sections we use variant (C) for the high-quality set 1, and
variant (B) for the other configurations, always including the
weighting scheme for adjacent photometric redshift bins. For all sets,
the survey is divided into $N_z = 10$ redshift bins, the downscaled
linear alignment model is used as GI signal, and $\Delta_z = 1.0$ is
fixed. For all these models nulling retains about 45 % of the
statistical power in terms of $r_f$ and depletes the GI contamination
by about a factor of 30. Figure 14 shows two-dimensional
marginalized $2\sigma$-error contours before and after nulling for set 2.
Note that since we did not add any priors to the Fisher matrix
calculation, negative values for e.g. $\Omega_{\rm b}$ are not excluded.

7.2. Uncertainty in redshift distribution parameters 

The parameters characterizing the redshift distributions are
determined from data, for instance by making use of a spectroscopic
subsample of galaxies. Hence, there is also uncertainty
in the shape of the $p^{(i)}(z)$, or equivalently, in the parameters
describing the redshift distributions such as $z_{\rm med}$ or $\sigma_{\rm ph}$. The
performance of variant (C), which explicitly takes into account
information about the redshift distributions, will clearly be affected
by this uncertainty, as shall be investigated in the following.

Fig. 13. Ratios of average statistical and systematic errors $r_f$ and
$r_b$ as a function of photometric redshift dispersion $\sigma_{\rm ph}$ and
outlier fraction $f_{\rm cat}$. The offset of the outlier distributions has
been fixed at $\Delta_z = 1$. As systematic the linear intrinsic alignment
model, downscaled by a factor of five, has been employed. To obtain
the bottom four panels, the calculations were repeated, now including
the weighting scheme outlined in Sect. 5.3. Left: Results for nulling
which takes into account knowledge of the redshift probability
distributions, i.e. variant (C). In panels 1 and 3 $r_f$ is shown, and in
panels 2 and 4 $r_b$. Right: Same as before, but for nulling with
referencing to the centers of the photometric redshift bins, i.e.
variant (B).

Fig. 14. Parameter constraints before and after nulling. Shown are the
two-dimensional marginalized $2\sigma$-errors for the original data set as
solid curves and for the nulled data set as dotted curves. The fiducial
parameter values are marked by the crosses. The survey has been divided
into $N_z = 10$ photometric redshift bins. Photometric redshift errors
are characterized by $\sigma_{\rm ph} = 0.05$, $f_{\rm cat} = 0.05$, and $\Delta_z = 1.0$.
As systematic the linear alignment model, downscaled by a factor of
five, has been employed. The nulling was done using variant (B),
including the weighting scheme outlined in Sect. 5.3.



We quantify the uncertainty in the redshift distributions in
terms of the median redshift, allowing for a Gaussian scatter with
width $\sigma_{z_{\rm med}}$ around the true value of $z_{\rm med}$ for every redshift
bin. Monte-Carlo samples of sets of $z_{\rm med}$ are then drawn from these
distributions and subsequently used to compute nulling weights,
perform the Fisher analysis of the nulled data set, and obtain the
ratio $r_b$. As input we use a set of power spectra calculated for
$N_z = 10$ bins with $\sigma_{\rm ph} = 0.03$ and without catastrophic outliers.
For the high-quality redshift information that nulling variant (C) is
suited for, one can adopt the requirements on $\sigma_{z_{\rm med}}$ of planned
satellite missions like Euclid, targeting $\sigma_{z_{\rm med}} = 0.001$ and
demanding at least $\sigma_{z_{\rm med}} = 0.002$. Drawing 5000 Monte-Carlo
samples each for both of these values of $\sigma_{z_{\rm med}}$ produces the
distributions of $r_b$ displayed in Fig. 15.
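
The Monte-Carlo procedure described above can be sketched as follows. This is a minimal illustration only: the function `toy_residual_bias` is a hypothetical stand-in for the full chain (nulling weights, Fisher analysis, bias ratio $r_b$), and its quadratic form and the numbers in it are assumptions, not results of this work.

```python
import random

N_BINS = 10          # number of photometric redshift bins, as in the text
N_SAMPLES = 5000     # number of Monte-Carlo samples, as in the text


def toy_residual_bias(offsets, r_min=0.003, steepness=1.0e3):
    """Hypothetical stand-in for the nulling + Fisher pipeline.

    Assumption: r_b is smallest when the nulling redshifts equal the true
    median redshifts and grows quadratically with the offsets.
    """
    return r_min + steepness * sum(dz * dz for dz in offsets)


def upper_limit(sigma_zmed, quantile=0.9, seed=1):
    """Value below which a fraction `quantile` of the sampled r_b lie."""
    rng = random.Random(seed)
    samples = []
    for _ in range(N_SAMPLES):
        # Gaussian scatter of every bin's median redshift around its true value
        offsets = [rng.gauss(0.0, sigma_zmed) for _ in range(N_BINS)]
        samples.append(toy_residual_bias(offsets))
    samples.sort()
    return samples[int(quantile * N_SAMPLES) - 1]


for sigma in (0.001, 0.002):
    print(f"sigma_zmed = {sigma}: 90% limit on toy r_b = {upper_limit(sigma):.4f}")
```

In a real application the toy function would be replaced by the actual computation of nulling weights and the Fisher analysis for each sampled set of median redshifts; only the sampling and the 90 % limit carry over unchanged.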

For each histogram a value $\hat{r}_b$ is marked, defined such that
$r_b < \hat{r}_b$ for 90 % of all samples. We find $\hat{r}_b \approx 0.010$ for
$\sigma_{z_{\rm med}} = 0.001$ and $\hat{r}_b \approx 0.019$ for $\sigma_{z_{\rm med}} = 0.002$.
The distributions peak at the value $r_b \approx 0.003$, which results from
using the $z_{\rm med}$ as nulling redshifts (see Fig. 8). Given a
non-vanishing photometric redshift error, $z_{\rm med}$ is not necessarily the
optimal choice, and indeed samples with $r_b < 0.003$ exist, although
the histograms decline rapidly for small $r_b$. The distribution for
$\sigma_{z_{\rm med}} = 0.002$ is much shallower and decreases only slowly for
$r_b > 0.003$, resulting in a $\hat{r}_b$ about twice as big as for
$\sigma_{z_{\rm med}} = 0.001$. Hence, nulling variant (C) requires knowledge
of the form of the redshift distributions comparable to the planned
goals of future satellite missions to fully demonstrate its potential.
Any moderate deviation of the nulling redshifts from their optimum,
approximated by the $z_{\rm med}$, results in a significant increase in
residual bias.

On the other hand, nulling variant (B) does not rely on detailed
knowledge about the $p^{(i)}(z)$ and performed well over a
wide range of redshift distribution characteristics, but only when
including the Gaussian weighting scheme of adjacent redshift
bins. The latter procedure does depend on the form of the redshift
distributions to a certain extent, as the width of the weight
should be chosen such that the Gaussian covers the range of
overlap between the redshift distributions, which in turn depends
on $\sigma_{\rm ph}$. However, general information about the width of the
redshift distributions is mandatory for all upcoming cosmic shear
surveys. Since the width of the Gaussian in (49) can in principle
be chosen arbitrarily, one can always adjust this width to safely
suppress the gp-term.

Table 3. Errors on cosmological parameters for three exemplary data
sets with different photometric redshift errors. Top: Ratios $r_f$ and
$r_b$ for the three data sets considered. Moreover, the parameters
specifying the photometric redshift errors and the nulling variant
used are given. The offset of outliers is fixed at $\Delta_z = 1.0$ for all
sets. The linear alignment model has been used throughout as
systematic, as well as the weighting scheme of Sect. 5.3. Note that
set no. 2 is the underlying data for the results of Fig. 14. Bottom:
Marginalized statistical errors $\sigma$, biases $b$, total errors
$\sigma_{\rm tot}$, and $b_{\rm rel}$ for every cosmological parameter, shown for both
original and nulled data sets. Besides, the ratios of statistical
errors and biases before and after nulling are given.

Fig. 15. Distribution of $r_b$ for 5000 Monte-Carlo samples of the
set of $z_{\rm med}$, using a model with $\sigma_{\rm ph} = 0.03$ and no catastrophic
outliers. The black hatched distribution was obtained for a scatter
of $\sigma_{z_{\rm med}} = 0.001$, the gray distribution for $\sigma_{z_{\rm med}} = 0.002$.
The vertical lines mark the limit $\hat{r}_b$, which is chosen such that
$r_b < \hat{r}_b$ for 90 % of all samples.

8. Summary & conclusions 

In this paper we investigated the performance of the nulling
technique as proposed by JS08, designed to geometrically eliminate
the contamination by gravitational shear-intrinsic ellipticity
correlations. In the presence of realistic photometric redshift
information and errors we considered both the information loss due
to nulling and the amount of residual bias. We suggested several
modifications and improvements to the original technique,
which we summarize by providing a recipe on how to apply
nulling to a cosmic shear tomography data set.

(1) Decide on which variant of nulling is best suited for the
data set. If the data has precise information about the redshift
distributions, and if these distributions have a small scatter and
negligible outlier fraction, then variant (C), which takes this
information into account, should be chosen. Otherwise variant (B)
is preferable, if combined with a Gaussian downweighting of
combinations of adjacent photometric redshift bins. This weighting
scheme is necessary since overlapping redshift distributions
can cause a swap of foreground and background galaxies, which
produces a GI signal that cannot be controlled by means of
nulling. Both variants perform considerably better than the
original referencing suggested by JS08.

(2) Calculate the nulling weights, depending on the variant 
chosen. This work defines these weights such that nulling can 
be interpreted as an orthonormal transformation of the cosmic 
shear data vector. Since the weights are composed of comoving 
distances, one has to assume a cosmology to compute them. An 
incorrect choice of parameters affects the GI removal and could 
in principle cause an even stronger bias on parameter estimates. 
We showed that any reasonable choice of cosmological parame- 
ters will produce equally suited nulling weights - one could even 
start with the resulting, largely biased parameters of the analysis 
of the original data set. Iteratively using the parameter estimates 
as input for a renewed nulling analysis renders the final results 
independent of any initial assumptions. 

(3) Compute nulled cosmic shear measures from the nulling
weights and the tomography measures available. As nulling does
not depend on angular scales, any measure such as the shear
correlation functions or the aperture mass dispersion is suited. The
number and size of photometric redshift bins should be chosen
such that the overlap of the corresponding redshift distributions
is kept at a minimum. Although nulling reduces the GI signal
also for a division into 5 bins, we found that $N_z \geq 10$ is required
to achieve good performance. Auto-correlations should be
excluded from the analysis because of the potential contamination
by an II signal. Applying the Gaussian weighting scheme will
also reduce the II contamination in shear measures of adjacent
photometric redshift bins.

Performing a likelihood analysis with the nulled data should
then yield parameter constraints that have a low residual bias due
to intrinsic alignment contributions. However, we outlined that
nulling inevitably reduces the information content in the data,
even if spectroscopic redshifts were available. We demonstrated
that lensing information, integrated over wide redshift ranges, is
eliminated together with the GI term, which can finally be traced
back to the distinct, but still similar dependence on redshift of
the lensing and GI signal. In terms of our figure of merit $r_f$ we
found that of the order of 50 % of the statistical power is lost. The
loss decreases for larger $N_z$, so that in contrast to a lensing-only
analysis $N_z \gg 5$ is desirable, which is in accordance with earlier
work (Bridle & King 2007; JS08).

In this paper we have not exploited any feature of intrinsic
alignments apart from its dependence on redshift. However,
observations suggest that the strongest intrinsic alignment signal
stems from luminous galaxies (Mandelbaum et al. 2006;
Hirata et al. 2007). Photometric redshift estimates for these
bright galaxies usually have a much smaller scatter (Ilbert et al.
2009), so that nulling may work better on this important subset.
Thus, our conclusions on the performance of the nulling
technique should be conservative.

Given excellent redshift information, nulling variant (C)
reduces the bias, averaged over all parameters considered as
defined in (44), by at least a factor of 100. To achieve this goal,
stringent conditions such as $\sigma_{\rm ph} < 0.03$, a negligible fraction of
catastrophic outliers, and an uncertainty in the median redshift of
$\sigma_{z_{\rm med}} \sim 0.001$ must hold. Even future space-based surveys will
fulfill these requirements only for a brighter subsample of galaxies
(which are expected to have the strongest intrinsic alignment
signal though), but this nulling version could still serve as a
valuable consistency check. To suppress the GI signal by a factor of
about 20, the conditions are moderately relaxed, in particular on
$\sigma_{\rm ph}$, in case the Gaussian weighting is used. Moreover, we
determined optimal nulling redshifts, demonstrating that for accurate
redshift information variant (C) is close to the best configuration
possible in this geometric approach.

Throughout the considered parameter plane, spanned by
$f_{\rm cat} \leq 0.1$ (corresponding to a true outlier fraction of < 6 %)
and $\sigma_{\rm ph} \leq 0.1$, the nulling version based on variant (B) was
capable of reducing the average bias by at least a factor of 10.
Consequently, the requirements on photometric redshift parameters
are low in this case. Merely a number $N_z \geq 10$ of photometric
redshift bins, for which the width of the underlying redshift
distributions should be known, is demanded - readily achieved
by the majority of future cosmic shear surveys. Although we
showed that the functional behavior of the residual bias is similar
for all considered models, the values of the residual bias depend
on the actual form of the GI signal. Since all models considered
in this work produce severe parameter biases, we have further
reason to believe that the numbers for the performance of the
nulling technique given above should be understood as
conservative.

We have neglected the contamination by the II signal in all 
our considerations, arguing that the nulling could be preceded 
by an appropriate II removal technique. While for disjoint pho- 
tometric redshift bins the II signal does not appear in the trans- 
formed data at all, it was demonstrated that, for realistic situa- 
tions, ignoring the II term may cause a significant contamination 
of a subset of the nulled power spectra. On the other hand, this 
restriction of the II signal to certain nulled power spectra only 
could also allow for a removal of II after nulling. In any case, 
the ultimate goal is a combined geometrical treatment of all in- 
trinsic alignment contributions, which is subject to forthcoming 
work. 

Although we sampled only a fraction of the huge parameter
space spanned by the various photometric redshift parameters,
GI models, and nulling variants, it should be possible to draw a
wide range of conclusions from this work. For instance, a
relevant question is how a cosmic shear data set should be binned
in order to remove intrinsic alignment and keep a maximum of
information. The bin boundaries should be chosen such that the
overlap of the corresponding redshift distributions is minimal,
as long as the distributions do not become too asymmetric.
Reinspecting Fig. 6, the number of bins should be as big as the
photometric redshift scatter allows, i.e. the width of the bins should
not become smaller than about $\sigma_{\rm ph}(1 + z)$ since otherwise no
more information is added. As our results show, the photometric
redshift scatter does not necessarily limit the level to which the
GI signal can be eliminated, but then it places strong bounds on
the remaining power to constrain cosmological parameters in the
nulled data set, see Fig. 13.
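
As a rough worked example of the rule that bins should not be narrower than about $\sigma_{\rm ph}(1 + z)$, the following sketch estimates how many contiguous bins of exactly that width fit into a redshift range. The redshift range and the function name are assumptions made for illustration; the closed form follows from integrating $dz / [\sigma_{\rm ph}(1 + z)]$ and is a continuum approximation.

```python
import math


def max_useful_bins(sigma_ph, z_min, z_max):
    """Approximate number of contiguous bins of width sigma_ph * (1 + z).

    Integrating dz / (sigma_ph * (1 + z)) from z_min to z_max gives
    ln((1 + z_max) / (1 + z_min)) / sigma_ph.
    """
    if sigma_ph <= 0 or z_max <= z_min or z_min <= -1:
        raise ValueError("need sigma_ph > 0 and z_max > z_min > -1")
    return math.log((1.0 + z_max) / (1.0 + z_min)) / sigma_ph


# Hypothetical survey range 0 < z < 2 (an assumption, not taken from the text)
for sigma_ph in (0.03, 0.05, 0.1):
    print(sigma_ph, math.floor(max_useful_bins(sigma_ph, 0.0, 2.0)))
```

The estimate only bounds the number of bins from above; whether that many bins are actually useful also depends on the asymmetry of the resulting redshift distributions, as noted in the text.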

We emphasize that, in spite of defining GI signals to quantify
the bias removal, the nulling technique itself does not rely
on any information about intrinsic alignment except for the
well-known redshift dependence of the GI term. In principle, nulling
is also applicable to data sets in which the GI contribution
dominates over lensing. Provided a sufficient suppression, it would
be possible to recover the cosmic shear signal by nulling the
data. Besides, nulling is not restricted to cosmic shear at the
two-point level. Concerning three-point statistics, gravitational
shear-intrinsic ellipticity cross terms, GII and GGI, may constitute an
even more serious contamination (Semboloni et al. 2008). The
geometric principle of nulling can be applied to tomography
bispectra and related real-space measures in a straightforward
manner (Shi et al., in preparation).

Due to the significant information loss of nulling, this technique
is most probably not desirable as the standard GI removal
tool for future surveys, so that the need for both an improved 
understanding of intrinsic alignment and high-performance re- 
moval techniques that take knowledge about the GI models into 
account persists. Still, with its very low level of input assump- 
tions, nulling serves as a valuable cross-check for these model- 
dependent techniques yet to be developed and as such can con- 
tribute to the credibility of cosmic shear as a powerful and robust 
cosmological probe.

Appendix A: Fisher matrix for a
parameter-dependent data vector

In the following we explicitly calculate the Fisher matrix for a
data vector $y$, transformed according to (25), where the
transformation $T$ depends on the parameters to be determined. We
closely follow the derivation of the Fisher matrix presented in
Tegmark et al. (1997). A comma notation is used to indicate
derivatives with respect to parameters.
For $y$ the Gaussian log-likelihood reads

$$ -\ln L_y(y|p) = \frac{n}{2} \ln 2\pi + \frac{1}{2} \ln \det C_y + \frac{1}{2} [y - \bar{y}]^{\rm T} C_y^{-1} [y - \bar{y}] , \tag{A.1} $$

where $n$ is the dimension of $y$, and where we dropped the arguments
of $\bar{y}$ and $C_y$ for notational convenience. Again, the expectation
value of a data vector is indicated by a bar over the corresponding
variable name. Making use of the matrix identity
$\ln \det C = {\rm tr} \ln C$, and defining the matrix
$D_y = (y - \bar{y}) (y - \bar{y})^{\rm T}$, one arrives at

$$ -\ln L_y(y|p) = \frac{n}{2} \ln 2\pi + \frac{1}{2} {\rm tr} \left\{ \ln C_y + C_y^{-1} D_y \right\} . \tag{A.2} $$

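According to the derivation in Tegmark et al. (1997), the second
derivative of (A.2) reads$^4$

$$ -[\ln L_y(y|p)]_{,\mu\nu} = \frac{1}{2} {\rm tr} \Big\{ C_y^{-1} C_{y,\mu\nu} - C_y^{-1} C_{y,\mu\nu} C_y^{-1} D_y - C_y^{-1} C_{y,\mu} C_y^{-1} C_{y,\nu} $$
$$ \qquad + C_y^{-1} \left( C_{y,\mu} C_y^{-1} C_{y,\nu} + C_{y,\nu} C_y^{-1} C_{y,\mu} \right) C_y^{-1} D_y - C_y^{-1} \left( C_{y,\mu} C_y^{-1} D_{y,\nu} + C_{y,\nu} C_y^{-1} D_{y,\mu} \right) + C_y^{-1} D_{y,\mu\nu} \Big\} , \tag{A.3} $$

where the rules $(\ln C)_{,\mu} = C^{-1} C_{,\mu}$ and
$(C^{-1})_{,\mu} = -C^{-1} C_{,\mu} C^{-1}$ were applied. The expectation
value of (A.3) yields the Fisher matrix, see the definition in (23).

$^4$ Note that there is a typo in Eq. (14) of Tegmark et al. (1997): A
factor $C^{-1}$ should be eliminated from the last term.

We compute the matrix $D_y$ and its derivatives in terms of the
original data set,

$$ D_y = T D_x T^{\rm T} ; \tag{A.4} $$
$$ D_{y,\mu} = T_{,\mu} D_x T^{\rm T} + T D_x T^{\rm T}_{,\mu} - T \bar{x}_{,\mu} (x - \bar{x})^{\rm T} T^{\rm T} - T (x - \bar{x}) \bar{x}^{\rm T}_{,\mu} T^{\rm T} ; $$
$$ D_{y,\mu\nu} = T_{,\mu\nu} D_x T^{\rm T} - \left[ T_{,\mu} \bar{x}_{,\nu} + T_{,\nu} \bar{x}_{,\mu} + T \bar{x}_{,\mu\nu} \right] (x - \bar{x})^{\rm T} T^{\rm T} $$
$$ \qquad + T D_x T^{\rm T}_{,\mu\nu} - T (x - \bar{x}) \left[ T_{,\mu} \bar{x}_{,\nu} + T_{,\nu} \bar{x}_{,\mu} + T \bar{x}_{,\mu\nu} \right]^{\rm T} $$
$$ \qquad + T_{,\mu} D_x T^{\rm T}_{,\nu} - T \bar{x}_{,\mu} (x - \bar{x})^{\rm T} T^{\rm T}_{,\nu} - T_{,\mu} (x - \bar{x}) \bar{x}^{\rm T}_{,\nu} T^{\rm T} $$
$$ \qquad + T \bar{x}_{,\mu} \bar{x}^{\rm T}_{,\nu} T^{\rm T} + T_{,\nu} D_x T^{\rm T}_{,\mu} $$
$$ \qquad - T \bar{x}_{,\nu} (x - \bar{x})^{\rm T} T^{\rm T}_{,\mu} - T_{,\nu} (x - \bar{x}) \bar{x}^{\rm T}_{,\mu} T^{\rm T} + T \bar{x}_{,\nu} \bar{x}^{\rm T}_{,\mu} T^{\rm T} , $$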

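where $D_x$ is defined in analogy to $D_y$. Using $\langle x \rangle = \bar{x}$ and
$\langle x x^{\rm T} \rangle = C_x + \bar{x} \bar{x}^{\rm T}$, we obtain the expectation values
of the former quantities,

$$ \langle D_y \rangle = T C_x T^{\rm T} = C_y ; \tag{A.5} $$
$$ \langle D_{y,\mu} \rangle = T_{,\mu} C_x T^{\rm T} + T C_x T^{\rm T}_{,\mu} ; $$
$$ \langle D_{y,\mu\nu} \rangle = T_{,\mu\nu} C_x T^{\rm T} + T C_x T^{\rm T}_{,\mu\nu} + T_{,\mu} C_x T^{\rm T}_{,\nu} + T_{,\nu} C_x T^{\rm T}_{,\mu} + T \left( \bar{x}_{,\mu} \bar{x}^{\rm T}_{,\nu} + \bar{x}_{,\nu} \bar{x}^{\rm T}_{,\mu} \right) T^{\rm T} . $$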

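With these expressions at hand we calculate the expectation
value of (A.3),

$$ -\left\langle [\ln L_y(y|p)]_{,\mu\nu} \right\rangle \tag{A.6} $$
$$ = \frac{1}{2} {\rm tr} \Big\{ C_y^{-1} \left( T_{,\nu} C_x T^{\rm T} + T C_{x,\nu} T^{\rm T} + T C_x T^{\rm T}_{,\nu} \right) C_y^{-1} \left( T_{,\mu} C_x T^{\rm T} + T C_{x,\mu} T^{\rm T} + T C_x T^{\rm T}_{,\mu} \right) $$
$$ \qquad - C_y^{-1} \left( T_{,\nu} C_x T^{\rm T} + T C_{x,\nu} T^{\rm T} + T C_x T^{\rm T}_{,\nu} \right) C_y^{-1} \left( T_{,\mu} C_x T^{\rm T} + T C_x T^{\rm T}_{,\mu} \right) $$
$$ \qquad - C_y^{-1} \left( T_{,\mu} C_x T^{\rm T} + T C_{x,\mu} T^{\rm T} + T C_x T^{\rm T}_{,\mu} \right) C_y^{-1} \left( T_{,\nu} C_x T^{\rm T} + T C_x T^{\rm T}_{,\nu} \right) $$
$$ \qquad + C_y^{-1} \left( T_{,\mu\nu} C_x T^{\rm T} + T C_x T^{\rm T}_{,\mu\nu} + T_{,\mu} C_x T^{\rm T}_{,\nu} + T_{,\nu} C_x T^{\rm T}_{,\mu} + T \left( \bar{x}_{,\mu} \bar{x}^{\rm T}_{,\nu} + \bar{x}_{,\nu} \bar{x}^{\rm T}_{,\mu} \right) T^{\rm T} \right) \Big\} . $$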



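Note that the first two terms in (A.3) cancel due to $\langle D_y \rangle = C_y$.
We now make extensive use of the fact that the trace is invariant
under cyclic permutations of matrices. Then one readily finds
that many terms in the first three lines of (A.6) cancel. Expanding
$C_y^{-1} = (T^{\rm T})^{-1} C_x^{-1} T^{-1}$, more terms cancel, either directly or
after cyclic permutation. This way (A.6) reduces to

$$ -\left\langle [\ln L_y(y|p)]_{,\mu\nu} \right\rangle = \frac{1}{2} {\rm tr} \Big\{ C_x^{-1} C_{x,\mu} C_x^{-1} C_{x,\nu} + C_x^{-1} \left( \bar{x}_{,\mu} \bar{x}^{\rm T}_{,\nu} + \bar{x}_{,\nu} \bar{x}^{\rm T}_{,\mu} \right) \tag{A.7} $$
$$ \qquad + T^{-1} T_{,\mu\nu} + T^{\rm T}_{,\mu\nu} (T^{\rm T})^{-1} - T^{-1} T_{,\mu} T^{-1} T_{,\nu} - (T^{\rm T})^{-1} T^{\rm T}_{,\mu} (T^{\rm T})^{-1} T^{\rm T}_{,\nu} \Big\} . $$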

The first two terms of this expression correspond to the Fisher
matrix $F^x$ of the data vector $x$, see (25). Finally, by employing
in addition that ${\rm tr}\, C^{\rm T} = {\rm tr}\, C$ and $(C^{\rm T})^{-1} = (C^{-1})^{\rm T}$, one
arrives at

$$ F^y_{\mu\nu} = F^x_{\mu\nu} + {\rm tr} \left\{ \ln T \right\}_{,\mu\nu} . \tag{A.8} $$

If we apply the condition $\det T = 1$, as required in Sect. 2.2, we
find ${\rm tr} \ln T = \ln \det T = 0$, and hence, the Fisher matrices of the
original data vector $x$ and the transformed one $y$ are equivalent.
This result is in agreement with (27), which, when transformed
to log-likelihood, reads

$$ -\ln L_y(y|p) = \ln \det T(p) - \ln L_x(x|p) = {\rm tr} \left\{ \ln T(p) \right\} - \ln L_x(x|p) , \tag{A.9} $$



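and reproduces (A.8) after taking derivatives and expectation
value. Employing the further simplification that the original
covariance $C_x$ does not depend on the parameters, the Fisher matrix
can be written as

$$ F^y_{\mu\nu} = \frac{1}{2} {\rm tr} \left\{ C_x^{-1} \left( \bar{x}_{,\mu} \bar{x}^{\rm T}_{,\nu} + \bar{x}_{,\nu} \bar{x}^{\rm T}_{,\mu} \right) \right\} = \frac{1}{2} {\rm tr} \left\{ C_y^{-1} T \left( \bar{x}_{,\mu} \bar{x}^{\rm T}_{,\nu} + \bar{x}_{,\nu} \bar{x}^{\rm T}_{,\mu} \right) T^{\rm T} \right\} , \tag{A.10} $$

which, after converting the trace to a sum, yields (28).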
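
The equality of the two forms in (A.10) can be checked numerically. The following is a minimal sketch for a two-dimensional data vector, written with plain Python lists; the covariance, the derivatives of the mean, and the rotation used as $T$ (which has $\det T = 1$) are toy values chosen for illustration, not quantities from this work.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def symmetrized_outer(u, v):
    """u v^T + v u^T, the combination of mean derivatives in (A.10)."""
    return [[u[i] * v[j] + v[i] * u[j] for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def fisher_x(c_x, dx_mu, dx_nu):
    """First form of (A.10): (1/2) tr{C_x^-1 (...)}."""
    return 0.5 * trace(matmul(inverse(c_x), symmetrized_outer(dx_mu, dx_nu)))


def fisher_y(c_x, t, dx_mu, dx_nu):
    """Second form of (A.10): (1/2) tr{C_y^-1 T (...) T^T}, C_y = T C_x T^T."""
    t_T = transpose(t)
    c_y = matmul(matmul(t, c_x), t_T)
    inner = matmul(matmul(t, symmetrized_outer(dx_mu, dx_nu)), t_T)
    return 0.5 * trace(matmul(inverse(c_y), inner))


# Toy inputs (assumptions): covariance, mean derivatives, and a rotation T
C_X = [[2.0, 0.3], [0.3, 1.0]]
DX_MU = [1.0, -0.5]
DX_NU = [0.2, 0.8]
ANGLE = 0.7
T = [[math.cos(ANGLE), -math.sin(ANGLE)],
     [math.sin(ANGLE), math.cos(ANGLE)]]

print(fisher_x(C_X, DX_MU, DX_NU), fisher_y(C_X, T, DX_MU, DX_NU))
```

Here $T$ is held fixed, so the sketch only illustrates the trace identity behind (A.10); the parameter dependence of $T$ enters through the additional term in (A.8).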

Appendix B: Validity of the bias formalism

As is evident from Sect. 3.3, a GI systematic that fits within the
error bounds of current observations can attain values of similar
order of magnitude as the lensing power spectrum. Besides, due
to the similar dependence on geometry, see (0 and ©, the effect 
of adding a GI systematic acts similarly to a change of cosmo- 
logical parameters, in particular those determining the amplitude 
of the lensing power spectrum. Consequently, we expect the sys- 
tematic to produce a strong bias, possibly much larger than the 
statistical error bounds. While this does not hamper the
performance of the nulling technique, it may render the bias formalism
as given by (29) invalid. In the following we derive
the parameter bias from the log-likelihood, taking special care
of approximations and the resulting limitations.

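Since we keep the assumption that the signal covariance $C_P$
does not depend on the parameters to be determined, the calculations
can be done directly in terms of the $\chi^2$, which is then twice
the log-likelihood. For a similar approach see e.g. Taburet et al.
(2009). We define a fiducial data vector $P^{\rm f}$, i.e. the signal in the
absence of systematic effects, and assume this signal to be
contaminated by a systematic $P^{\rm sys}$. A set of models $P(p)$, depending on
a set of parameters $p$, is fitted to the signal, where $p^{\rm f}$ denotes
the fiducial set of parameters such that $P(p^{\rm f}) = P^{\rm f}$. Then the $\chi^2$
reads

$$ \chi^2(p) = \sum_{\alpha,\beta} \left( P_\alpha(p) - P^{\rm tot}_\alpha \right) \left( C_P^{-1} \right)_{\alpha\beta} \left( P_\beta(p) - P^{\rm tot}_\beta \right) , \tag{B.1} $$

where $P^{\rm tot}_\alpha = P^{\rm f}_\alpha + P^{\rm sys}_\alpha$. Writing the unbiased $\chi^2$ as

$$ \chi^2_{\rm f}(p) = \sum_{\alpha,\beta} \left( P_\alpha(p) - P^{\rm f}_\alpha \right) \left( C_P^{-1} \right)_{\alpha\beta} \left( P_\beta(p) - P^{\rm f}_\beta \right) , \tag{B.2} $$

one can expand (B.1) to yield

$$ \chi^2(p) = \chi^2(p^{\rm f}) + \chi^2_{\rm f}(p) - 2 \sum_{\alpha,\beta} P^{\rm sys}_\alpha \left( C_P^{-1} \right)_{\alpha\beta} \left( P_\beta(p) - P^{\rm f}_\beta \right) , \tag{B.3} $$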

where $p^{\rm f}$ produces the maximum likelihood (or minimum $\chi^2$) in the
absence of a systematic. Since $P(p^{\rm f}) = P^{\rm f}$, $\chi^2(p^{\rm f})$ contains only
the systematic power spectrum and causes an irrelevant overall
rescaling of the $\chi^2$ in parameter space. Hence, the modification
of the $\chi^2$ due to the systematic is contained in the last term of
(B.3). It can shift the point of maximum likelihood and deform
the likelihood in its vicinity, depending on both the parameters
and the form of the systematic.

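Considering (B.1) again, $\chi^2(p)$ can be written as a Taylor
expansion around the fiducial set of parameters,

$$ \chi^2(p) = \chi^2(p^{\rm f}) + \sum_\mu \left. \frac{\partial \chi^2}{\partial p_\mu} \right|_{\rm f} (p_\mu - p^{\rm f}_\mu) + \frac{1}{2} \sum_{\mu,\nu} \left. \frac{\partial^2 \chi^2}{\partial p_\mu \partial p_\nu} \right|_{\rm f} (p_\mu - p^{\rm f}_\mu) (p_\nu - p^{\rm f}_\nu) + \dots , \tag{B.4} $$

where the subscript f indicates that the derivatives are evaluated
at $p^{\rm f}$. Making again use of $P(p^{\rm f}) = P^{\rm f}$, one obtains for the
derivatives from (B.3)

$$ \left. \frac{\partial \chi^2}{\partial p_\mu} \right|_{\rm f} = -2 \sum_{\alpha,\beta} P^{\rm sys}_\alpha \left( C_P^{-1} \right)_{\alpha\beta} \frac{\partial P_\beta}{\partial p_\mu} ; \tag{B.5} $$
$$ \left. \frac{\partial^2 \chi^2}{\partial p_\mu \partial p_\nu} \right|_{\rm f} = 2 \sum_{\alpha,\beta} \frac{\partial P_\alpha}{\partial p_\mu} \left( C_P^{-1} \right)_{\alpha\beta} \frac{\partial P_\beta}{\partial p_\nu} - 2 \sum_{\alpha,\beta} P^{\rm sys}_\alpha \left( C_P^{-1} \right)_{\alpha\beta} \frac{\partial^2 P_\beta}{\partial p_\mu \partial p_\nu} . \tag{B.6} $$

In the absence of a systematic, dividing (B.6) by 2 yields the Fisher
matrix, so that in the case of a biased $\chi^2$ one can define an
equivalent to the Fisher matrix as

$$ F'_{\mu\nu} = F_{\mu\nu} - \sum_{\alpha,\beta} P^{\rm sys}_\alpha \left( C_P^{-1} \right)_{\alpha\beta} \frac{\partial^2 P_\beta}{\partial p_\mu \partial p_\nu} . \tag{B.7} $$

We want to determine the bias $b = p^{\rm b} - p^{\rm f}$, where $p^{\rm b}$ is the
point in parameter space where the biased $\chi^2$ attains its minimum.
The biased parameter set $p^{\rm b}$ is computed from (B.4), using
the expansion up to second order, which results in

$$ 0 = \left. \frac{\partial \chi^2}{\partial p_\mu} \right|_{\rm b} = -2 \sum_{\alpha,\beta} P^{\rm sys}_\alpha \left( C_P^{-1} \right)_{\alpha\beta} \frac{\partial P_\beta}{\partial p_\mu} + 2 \sum_\nu F'_{\mu\nu} b_\nu , \tag{B.8} $$

where the derivative of the $\chi^2$ has been evaluated at $p^{\rm b}$. Provided
that the biased Fisher matrix (B.7) has an inverse, too, one can
solve for the bias and obtain

$$ b_\mu = \sum_\nu \left( F'^{-1} \right)_{\mu\nu} \sum_{\alpha,\beta} P^{\rm sys}_\alpha \left( C_P^{-1} \right)_{\alpha\beta} \frac{\partial P_\beta}{\partial p_\nu} . \tag{B.9} $$

If one assumes that the systematic is small such that the second
term in (B.7) becomes subdominant, (B.9) reproduces the known
bias formula (29).

Fig. B.1. Comparison of statistical errors and biases obtained by
Fisher matrix and $\chi^2$ calculations. Top panel: Ratio of bias over
statistical error $b_{\rm rel}$ as a function of the scaling of the systematic
$A_{\rm sys}$. Results for a 1 deg$^2$ survey are shown as black curves, and
for a 100 deg$^2$ survey as gray curves. Bottom panel: Ratios of
the statistical errors $r'_\sigma$ and biases $r'_b$ as a function of the
scaling of the systematic $A_{\rm sys}$. Solid lines correspond to $r'_\sigma$,
dashed lines to $r'_b$. As above, results for a 1 deg$^2$ and a 100 deg$^2$
survey are shown as black and gray curves, respectively. Note that the
curves for $r'_b$ almost completely overlap.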

In summary, the differences in employing the exact
likelihood/$\chi^2$ formalism (B.1) or the Fisher matrix approach (28, 29)
can be reduced to cutting the Taylor expansion in (B.4) after the
second order in $p$, and dropping the second term in (B.7). Both
approximations are fair if the amplitude of the systematic and
the bias it produces are sufficiently small.

To quantify the validity of these approximations in the context
of this work we create a cosmic shear tomography survey
with $N_z = 10$ redshift bins without photometric redshift errors.
The GI signal is calculated via the linear intrinsic alignment
model, with a free overall scaling $A_{\rm sys}$ to control the amplitude
of the systematic. The original GI model corresponds to
$A_{\rm sys} = 1$. We use $\Omega_{\rm m}$ as the only parameter to be constrained,
setting a fiducial value of 0.4 for this exemplary analysis. Thereby,
as the GI signal biases $\Omega_{\rm m}$ low, we allow for large biases in a
range of still reasonable parameter values. To achieve a suitable
magnitude of statistical errors, the survey size is set to 1 deg$^2$ and
100 deg$^2$, respectively, with the remaining parameters kept at the
values given in Sect. 5. The exact errors are calculated via (B.1) on
a grid in parameter space with steps of $10^{-4}$ between $\Omega_{\rm m} = 0.1$
and $\Omega_{\rm m} = 0.5$. While the minimum $\chi^2$ is simply read off the
grid values, the $1\sigma$-errors are computed by linear interpolation
on the grid, with $\Delta\chi^2 = 1$ from the minimum for one degree of
freedom.
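
The comparison described above can be sketched for a one-parameter toy problem. Everything specific in the block below is an assumption made for illustration: the power-law signal, the amplitudes, the variance, and the systematic (taken as a negative fraction of the fiducial signal) are not the models of this work. Only the procedure follows the text: the $\chi^2$ of (B.1) on a grid with steps of $10^{-4}$, the minimum read off the grid, $1\sigma$-errors from $\Delta\chi^2 = 1$ by linear interpolation, and the Fisher-matrix bias for comparison.

```python
import math

P_FID = 0.4                      # fiducial parameter value, as in the text
AMPS = [1.0, 1.5, 2.0, 2.5]      # toy amplitudes of the data vector (assumption)
VAR = 0.01                       # toy variance of every element (assumption)


def model(p):
    """Toy signal P_alpha(p); the power-law form is an assumption."""
    return [a * p ** 1.5 for a in AMPS]


def dmodel(p):
    """Derivative of the toy signal with respect to p."""
    return [1.5 * a * p ** 0.5 for a in AMPS]


def systematic(a_sys):
    """Toy systematic: a negative fraction of the fiducial signal."""
    return [-0.1 * a_sys * s for s in model(P_FID)]


def chi2(p, a_sys):
    """chi^2 of the model against fiducial signal plus systematic."""
    fid, sys = model(P_FID), systematic(a_sys)
    return sum((m - f - s) ** 2 / VAR for m, f, s in zip(model(p), fid, sys))


def fisher_bias_and_error(a_sys):
    """Bias and 1-sigma error from the Fisher formalism (one parameter)."""
    deriv = dmodel(P_FID)
    fisher = sum(d * d / VAR for d in deriv)
    bias = sum(s * d / VAR for s, d in zip(systematic(a_sys), deriv)) / fisher
    return bias, 1.0 / math.sqrt(fisher)


def grid_bias_and_error(a_sys, lo=0.1, hi=0.5, step=1.0e-4):
    """Bias and 1-sigma error read off a chi^2 grid."""
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n)]
    values = [chi2(p, a_sys) for p in grid]
    i_min = min(range(n), key=values.__getitem__)
    target = values[i_min] + 1.0     # Delta chi^2 = 1 for one parameter

    def crossing(indices):
        prev = i_min
        for i in indices:
            if values[i] >= target:
                # linear interpolation between grid points prev and i
                frac = (target - values[prev]) / (values[i] - values[prev])
                return grid[prev] + frac * (grid[i] - grid[prev])
            prev = i
        raise ValueError("Delta chi^2 = 1 not reached on the grid")

    upper = crossing(range(i_min + 1, n))
    lower = crossing(range(i_min - 1, -1, -1))
    return grid[i_min] - P_FID, 0.5 * (upper - lower)


for a_sys in (0.2, 1.0, 5.0):
    b_f, s_f = fisher_bias_and_error(a_sys)
    b_g, s_g = grid_bias_and_error(a_sys)
    print(f"A_sys = {a_sys}: bias ratio = {b_g / b_f:.3f}, "
          f"error ratio = {s_g / s_f:.3f}")
```

The grid error is taken as half the distance between the two $\Delta\chi^2 = 1$ crossings, which is one simple choice for an asymmetric $\chi^2$; the text does not specify how the two sides were combined.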

We define the ratios

$$ r'_\sigma \equiv \frac{\sigma_{\chi^2}}{\sigma_F} ; \qquad r'_b \equiv \frac{b_{\chi^2}}{b_F} , \tag{B.10} $$

where $\sigma_{\chi^2}$ denotes the statistical error on $\Omega_{\rm m}$ obtained by the
likelihood calculation, and where $\sigma_F$ is the statistical error
resulting from the computation of the Fisher matrix. Analogous
definitions hold for the biases $b_{\chi^2}$ and $b_F$. In Fig. B.1 the ratios
$r'_\sigma$ and $r'_b$ are plotted as a function of $A_{\rm sys}$. Apart from
uncertainties due to the finite grid resolution the results for both
survey sizes agree very well, but since the bias does not depend on the
survey size $A$, and $\sigma \propto 1/\sqrt{A}$, the ratios of bias over statistical
error differ by a factor of 10. Thus, the limits within which the bias
formalism yields accurate results do not depend on this ratio. Instead,
the deviations from the exact $\chi^2$ results are a function of the
amplitude of the systematic with respect to the original signal.
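
The factor of 10 quoted above follows directly from the stated scalings. Written out as a short worked step, with $A_1 = 1\,{\rm deg}^2$ and $A_2 = 100\,{\rm deg}^2$ denoting the two survey sizes (labels introduced here for illustration only):

```latex
\frac{(b/\sigma)_{A_2}}{(b/\sigma)_{A_1}}
  = \frac{\sigma_{A_1}}{\sigma_{A_2}}
  = \sqrt{\frac{A_2}{A_1}}
  = \sqrt{\frac{100\,{\rm deg}^2}{1\,{\rm deg}^2}}
  = 10 .
```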

For $A_{\rm sys} = 1$, i.e. the default GI signal, we find a deviation of
the bias obtained by the Fisher matrix formalism of only 2.4 %,
despite the strong systematic. The true bias is less than 10 %
larger throughout, even for a very large systematic that
dominates the signal by far. In the analysis considered here, both the
curvature of the GG power spectrum and the systematic power
spectrum are negative, so that the second term in (B.7) should in
general be negative, too. Consequently, $F' < F$, causing (B.9) to
produce larger biases than (29), which is evident in Fig. B.1.

If the amplitude of the systematic increases, the second term
in (B.7) becomes more important, thereby leading to a scaling
of the bias with less than $A_{\rm sys}$ in (B.9). Hence, the ratio of
biases can decrease for large $A_{\rm sys}$ because the bias, as computed
from (29), continues to scale with $A_{\rm sys}$, an effect which is also
seen in the figure. A similar behavior may be expected from the
inclusion of the third order in (B.4), as it leads to a term with
bias squared in (B.8), thereby placing the term scaling with $P^{\rm sys}$
under a square root when solving for $b$.

In the presence of a bias, a more accurate way to obtain
statistical errors than using the original Fisher matrix would be via $F'$.
As opposed to the Fisher matrix formalism, the statistical errors
then become dependent on the systematic. Inspecting (B.7), errors
scale linearly with $A_{\rm sys}$ and should increase because of $F' < F$.
Again Fig. B.1 demonstrates that this holds true to good
approximation, yielding already an 8 % effect at $A_{\rm sys} = 1$. Downscaling
the systematic to $A_{\rm sys} = 0.2$, the bias formalism should produce
results that are very close to the full likelihood calculation, even
for the full set of cosmological parameters.



